C8 and C10 medium-chain thioesterases in plants

ABSTRACT

By this invention, further plant medium-chain acyl-ACP thioesterases are provided, as well as uses of long-chain thioesterase sequences in conjunction with medium-chain thioesterase sequences. In a first embodiment, this invention relates to particular medium-chain thioesterase sequences from elm and Cuphea, and to DNA constructs for the expression of these thioesterases in host cells for production of C8 and C10 fatty acids. Other aspects of this invention relate to methods for using plant medium-chain thioesterases or medium-chain thioesterases from non-plant sources to provide medium-chain fatty acids in plant cells. As a further aspect, uses of long-chain thioesterase sequences for anti-sense methods in plant cells in conjunction with expression of medium-chain thioesterases in plant cells are described.

This application is a continuation-in-part of U.S. Ser. No. 07/968,971 filed Oct. 30, 1992 now U.S. Pat. No. 5,455,167.

TECHNICAL FIELD

The present invention is directed to amino acid and nucleic acid sequences and constructs, and methods related thereto.

BACKGROUND

Members of several plant families synthesize large amounts of predominantly medium-chain (C8-C14) triacylglycerols in specialized storage tissues, some of which are harvested for production of important dietary or industrial medium-chain fatty acids (F. D. Gunstone, The Lipid Handbook (Chapman & Hall, New York, 1986) pp. 55-112). Laurate (C12:0), for example, is currently extracted from seeds of tropical trees at a rate approaching one million tons annually (Battey, et al., Tibtech (1989) 71:122-125).

The mechanism by which the ubiquitous long-chain fatty acid synthesis is switched to specialized medium-chain production has been the subject of speculation for many years (Harwood, Ann. Rev. Plant Physiol. Plant Mol. Biology (1988) 39:101-138). Recently, Pollard, et al., (Arch. of Biochem. and Biophys. (1991) 284:1-7) identified a medium-chain acyl-ACP thioesterase activity in developing oilseeds of California bay, Umbellularia californica. This activity appears only when the developing cotyledons become committed to the near-exclusive production of triglycerides with lauroyl (12:0) and caproyl (10:0) fatty acids. This work presented the first evidence for a mechanism for medium-chain fatty acid synthesis in plants: During elongation the fatty acids remain esterified to acyl-carrier protein (ACP). If the thioester is hydrolyzed prematurely, elongation is terminated by release of the medium-chain fatty acid. The Bay thioesterase was subsequently purified by Davies et al., (Arch. Biochem. Biophys. (1991) 290:37-45), which allowed the cloning of a corresponding cDNA which has been used to obtain related clones and to modify the triglyceride composition of plants (WO 91/16421 and WO 92/20236).

SUMMARY OF THE INVENTION

By this invention, further plant medium-chain thioesterases, and uses of plant long-chain thioesterase antisense sequences are provided. In addition, uses of medium-chain thioesterases from non-plant sources are considered.

In a first embodiment, this invention is directed to nucleic acid sequences which encode plant medium-chain preferring thioesterases, in particular those which demonstrate preferential activity towards fatty acyl-ACPs having a carbon chain length of C8 or C10. This includes sequences which encode biologically active plant thioesterases as well as sequences which are to be used as probes, vectors for transformation or cloning intermediates. Biologically active sequences are preferentially found in a sense orientation with respect to transcriptional regulatory regions found in various constructs. The plant thioesterase encoding sequences may encode a complete or partial sequence depending upon the intended use. The instant invention pertains to the entire or portions of the genomic sequence or cDNA sequence and to the thioesterase protein encoded thereby, including precursor or mature plant thioesterase. Plant thioesterases exemplified herein include a Cuphea hookeriana (Cuphea) and an Ulmaceae (elm) thioesterase. The exemplified thioesterase sequences may also be used to obtain other similar plant thioesterases.

Of special interest are recombinant DNA constructs which can provide for the transcription or transcription and translation (expression) of the plant thioesterase sequence. In particular, constructs which are capable of transcription or transcription and translation in plant host cells are preferred. Such constructs may contain a variety of regulatory regions including transcriptional initiation regions obtained from genes preferentially expressed in plant seed tissue.

In a second aspect, this invention relates to the presence of such constructs in host cells, especially plant host cells, and to a method for producing a plant thioesterase in a host cell or progeny thereof via the expression of a construct in the cell. In a related aspect, this invention includes transgenic host cells which have an expressed plant thioesterase therein.

In a different embodiment, this invention relates to methods of using a DNA sequence encoding a plant thioesterase for the modification of the proportion of free fatty acids produced within a cell, especially plant cells. Plant cells having such a modified free fatty acid composition are also contemplated herein.

Methods to further increase the medium-chain fatty acid content of plant seed oils from plants engineered to contain medium-chain acyl-ACP thioesterase are provided in an additional embodiment. In particular, antisense sequences associated with plant long-chain thioesterases are used to decrease the native plant long-chain thioesterases, thus providing greater substrate availability for the medium-chain thioesterase.

Other aspects of this invention relate to methods for using a plant medium-chain thioesterase. Expression of a plant medium-chain thioesterase in a bacterial cell to produce medium-chain fatty acids is provided. By this method, quantities of such fatty acids may be harvested from bacteria. Exemplified in the application is the use of E. coli expressing elm and Cuphea thioesterases; the fadD E. coli mutant is preferred in some applications. In addition, temperature ranges for improved medium-chain fatty acid production are described.

Similarly, non-plant enzymes having medium-chain acyl-ACP thioesterase activity are useful in the plant and bacteria expression methods discussed. In particular, an acyl transferase from Vibrio harveyi is useful in applications for production of C14 medium-chain fatty acids.

Methods to produce an unsaturated medium-chain thioesterase by the use of a plant medium-chain thioesterase are also described herein. It is now found that, even in plants which exclusively produce and incorporate quantities of saturated medium-chain acyl-ACP fatty acids into triglycerides, the thioesterase may have activity against unsaturated fatty acids of the same length.

DESCRIPTION OF THE FIGURES

FIG. 1. The nucleic acid sequence and translated amino acid sequence of a bay C12:0-ACP thioesterase cDNA clone (SEQ ID NO:1), are provided.

FIG. 2. The nucleic acid sequence and translated amino acid sequence of an elm C10:0-ACP thioesterase partial cDNA clone (SEQ ID NO:2), are provided.

FIG. 3. DNA sequence of a PCR fragment of a Cuphea thioesterase gene (SEQ ID NO:3), is presented. Translated amino acid sequence in the region corresponding to the Cuphea thioesterase gene is also shown.

FIG. 4. DNA sequences of C. hookeriana C93A PCR fragments from clones 14-2 (SEQ ID NO:4) and 14-9 (SEQ ID NO:5), are provided.

FIG. 5. Preliminary DNA sequence and translated amino acid sequence from the 5' end of a Cuphea hookeriana thioesterase (CUPH-1) cDNA clone (SEQ ID NO:6), is shown.

FIG. 6. The entire nucleic acid sequence and the translated amino acid sequence of a full length Cuphea hookeriana thioesterase (CUPH-1) cDNA clone, CMT9 (SEQ ID NO:7), is shown.

FIG. 7. The nucleic acid sequence and the translated amino acid sequence of a full length Cuphea hookeriana thioesterase (CUPH-2) cDNA clone, CMT7 (SEQ ID NO:8), is shown.

FIG. 8. The nucleic acid sequence of a Cuphea hookeriana thioesterase cDNA clone, CMT13 (SEQ ID NO:9), is shown.

FIG. 9. The nucleic acid sequence of a Cuphea hookeriana thioesterase cDNA clone, CMT10 (SEQ ID NO:10), is shown.

FIG. 10. The nucleic acid sequence and translated amino acid sequence of a Cuphea hookeriana thioesterase cDNA clone, CLT7 (SEQ ID NO:11), is shown.

FIG. 11. Nucleic acid sequence and translated amino acid sequence of a Brassica campestris long-chain acyl ACP thioesterase clone (SEQ ID NO:12), is shown.

DETAILED DESCRIPTION OF THE INVENTION

Plant thioesterases, including medium-chain plant thioesterases are described in WO 91/16421 (PCT/US91/02960), WO 92/20236 (PCT/US92/04332) and U.S. Ser. No. 07/824,247 which are hereby incorporated by reference in their entirety.

A plant medium-chain thioesterase of this invention includes any sequence of amino acids, peptide, polypeptide or protein obtainable from a plant source which demonstrates the ability to catalyze the production of free fatty acid(s) from C8-C14 fatty acyl-ACP substrates under plant enzyme reactive conditions. By "enzyme reactive conditions" is meant that any necessary conditions are available in an environment (i.e., such factors as temperature, pH, lack of inhibiting substances) which will permit the enzyme to function. Of particular interest in the instant application are C8 and C10 preferring acyl-ACP thioesterases obtainable from Cuphea hookeriana and elm (an Ulmus species).

Plant thioesterases are obtainable from the specific exemplified sequences provided herein and from related sources. For example, several species in the genus Cuphea accumulate triglycerides containing medium-chain fatty acids in their seeds, e.g., procumbens, lutea, hookeriana, hyssopifolia, wrightii and inflata. Another natural plant source of medium-chain fatty acids is seeds of the Lauraceae family: e.g., Pisa (Actinodaphne hookeri) and Sweet Bay (Laurus nobilis). Other plant sources include Myristicaceae, Simarubaceae, Vochysiaceae, and Salvadoraceae, and rainforest species of Erisma, Picramnia and Virola, which have been reported to accumulate C14 fatty acids.

As noted above, plants having significant presence of medium-chain fatty acids therein are preferred candidates to obtain naturally-derived medium-chain preferring plant thioesterases. However, it should also be recognized that other plant sources which do not have a significant presence of medium-chain fatty acids may be readily screened as other enzyme sources. In addition, a comparison between endogenous medium-chain preferring plant thioesterases and between longer and/or shorter chain preferring plant thioesterases may yield insights for protein modeling or other modifications to create synthetic medium-chain preferring plant thioesterases, as discussed above.

Additional enzymes having medium-chain acyl-ACP thioesterase activity are also described herein which are obtained from non-plant sources, but which may be modified and combined with plant sequences for use in constructs for plant genetic engineering applications. Furthermore, such sequences may be used for production of medium-chain fatty acids in procaryotic cells, such as described herein for bay thioesterase.

One skilled in the art will readily recognize that antibody preparations, nucleic acid probes (DNA and RNA) and the like may be prepared and used to screen and recover "homologous" or "related" thioesterases from a variety of plant sources. For immunological screening methods, antibody preparations either monoclonal or polyclonal are utilized. For detection, the antibody is labeled using radioactivity or any one of a variety of second antibody/enzyme conjugate systems that are commercially available. Examples of some of the available antibody detection systems are described by Oberfilder (Focus (1989) BRL Life Technologies, Inc., 11:1-5).

Homologous sequences are found when there is an identity of sequence, which may be determined upon comparison of sequence information, nucleic acid or amino acid, or through hybridization reactions between a known thioesterase and a candidate source. Conservative changes, such as Glu/Asp, Val/Ile, Ser/Thr, Arg/Lys and Gln/Asn may also be considered in determining amino acid sequence homology. Amino acid sequences are considered homologous by as little as 25% sequence identity between the two complete mature proteins. (See generally, Doolittle, R. F., OF URFS and ORFS (University Science Books, Calif., 1986).) Typically, a lengthy nucleic acid sequence may show as little as 50-60% sequence identity, and more preferably at least about 70% sequence identity, between the target sequence and the given plant thioesterase of interest excluding any deletions which may be present, and still be considered related.

A genomic or other appropriate library prepared from the candidate plant source of interest may be probed with conserved sequences from plant thioesterase to identify homologously related sequences. Shorter probes are often particularly useful for polymerase chain reactions (PCR), especially when highly conserved sequences can be identified.

When longer nucleic acid fragments are employed (>100 bp) as probes, especially when using complete or large cDNA sequences, one would screen with low stringencies (for example 40°-50° C. below the melting temperature of the probe) in order to obtain signal from the target sample with 20-50% deviation, i.e., homologous sequences. (See, Beltz, et al. Methods in Enzymology (1983) 100:266-285.)
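The low-stringency window described above can be illustrated with a short calculation. The following Python sketch is offered only as an illustration: the melting temperature formula is a widely quoted empirical approximation for long DNA duplexes and is an assumption here, not a formula given in this specification, and the probe parameters (GC content, length, salt and formamide concentrations) are hypothetical values.

```python
import math


def estimate_tm(gc_percent, length_bp, na_molar, formamide_percent=0.0):
    """Approximate melting temperature (deg C) of a long DNA duplex.

    ASSUMPTION: uses the commonly quoted empirical relation
    Tm = 81.5 + 16.6*log10[Na+] + 0.41*(%GC) - 0.61*(%formamide) - 500/L,
    which is not taken from this specification.
    """
    return (81.5
            + 16.6 * math.log10(na_molar)
            + 0.41 * gc_percent
            - 0.61 * formamide_percent
            - 500.0 / length_bp)


def low_stringency_window(tm, lower_offset=50.0, upper_offset=40.0):
    """Temperature range lying 40-50 deg C below the estimated Tm."""
    return (tm - lower_offset, tm - upper_offset)


if __name__ == "__main__":
    # Hypothetical probe: 500 bp, 50% GC, 1.0 M Na+, no formamide.
    tm = estimate_tm(gc_percent=50.0, length_bp=500, na_molar=1.0)
    low, high = low_stringency_window(tm)
    print(f"estimated Tm: {tm:.1f} C; screen between {low:.1f} and {high:.1f} C")
```

In practice the melting temperature of a given probe would be determined or estimated by whatever method the practitioner prefers; the sketch only shows how the 40°-50° C. offset translates into a working temperature range.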

Using methods known to those of ordinary skill in the art, a DNA sequence encoding a plant medium-chain thioesterase can be inserted into constructs which can be introduced into a host cell of choice for expression of the enzyme, including plant cells for the production of transgenic plants. Thus, potential host cells include both prokaryotic and eukaryotic cells. A host cell may be unicellular or found in a multicellular differentiated or undifferentiated organism depending upon the intended use. Cells of this invention may be distinguished by having a plant thioesterase foreign to the wild-type cell present therein, for example, by having a recombinant nucleic acid construct encoding a plant thioesterase therein.

Also, depending upon the host, the regulatory regions will vary, including regions from viral, plasmid or chromosomal genes, or the like. For expression in prokaryotic or eukaryotic microorganisms, particularly unicellular hosts, a wide variety of constitutive or regulatable promoters may be employed. Among transcriptional initiation regions which have been described are regions from bacterial and yeast hosts, such as E. coli, B. subtilis, Saccharomyces cerevisiae, including genes such as beta-galactosidase, T7 polymerase, tryptophan E and the like.

For the most part, when expression in a plant host cell is desired, the constructs will involve regulatory regions (promoters and termination regions) functional in plants. The open reading frame, coding for the plant thioesterase or functional fragment thereof will be joined at its 5' end to a transcription initiation regulatory region such as the wild-type sequence naturally found 5' upstream to the thioesterase structural gene. Numerous other transcription initiation regions are available which provide for a wide variety of constitutive or regulatable, e.g., inducible, transcription of the structural gene functions. Among transcriptional initiation regions used for plants are such regions associated with the structural genes such as for CaMV 35S and nopaline and mannopine synthases, or with napin, ACP promoters and the like. The transcription/translation initiation regions corresponding to such structural genes are found immediately 5' upstream to the respective start codons. If a particular promoter is desired, such as a promoter native to the plant host of interest or a modified promoter, i.e., having transcription initiation regions derived from one gene source and translation initiation regions derived from a different gene source, including the sequence encoding the plant thioesterase of interest, or enhanced promoters, such as double 35S CaMV promoters, the sequences may be joined together using standard techniques. For most applications desiring the expression of medium-chain thioesterases in plants, the use of seed specific promoters is preferred.

It is noted that such constructs have been successfully used in genetic engineering applications to produce C12 (laurate) in plants which do not normally contain such medium-chain fatty acids (WO 91/16421). In particular, a bay C12 preferring acyl-ACP thioesterase was expressed in Brassica and Arabidopsis plants. Seeds from the resulting plants were observed to contain up to 50 mole percent laurate in the seed oils (WO 92/20236).

A further genetic engineering approach to increase the medium-chain fatty acid production in such transgenic plants utilizes antisense sequence of the native long-chain thioesterase in the target host plant. In this manner, the amount of long-chain thioesterase is decreased. As a result, the introduced medium-chain thioesterase has increased available substrate and the content of medium-chain fatty acids produced may be similarly increased.

Other genetic engineering approaches to increase medium-chain fatty acids would include, for example, insertion of additional DNA sequence encoding plant thioesterase structural genes into cells, or use of transcriptional initiation regions evidencing higher mRNA copy numbers or an improved timing specificity profile which corresponds better to the availability of substrate. For example, analysis of the time course of laurate production, under regulatory control of a napin promoter, in seeds of a Brassica plant demonstrates that the appearance of medium-chain thioesterase activity lags behind the onset of storage oil synthesis by approximately 5-7 days. Calculations show that about 20% of the total fatty acids are already synthesized before the medium-chain thioesterase makes significant impact. Thus, substantially higher medium-chain fatty acid levels (10-20%) might be obtained if the thioesterase gene is expressed at an earlier stage of embryo development.

Additionally, means to increase the efficiency of translation may include the use of the complete structural coding sequence of the medium-chain thioesterase gene. Thus, use of the complete 5'-region of the medium-chain thioesterase coding sequence may improve medium-chain fatty acid production.

When a plant medium-chain thioesterase is expressed in a bacterial cell, particularly in a bacterial cell which is not capable of efficiently degrading fatty acids, an abundance of medium-chain fatty acids can be produced and harvested from the cell. Similarly, overproduction of non-plant enzymes having acyl-ACP thioesterase activity is also useful for production of medium-chain fatty acids in E. coli. In some instances, medium-chain fatty acid salts form crystals which can be readily separated from the bacterial cells. Bacterial mutants which are deficient in acyl-CoA synthase, such as the E. coli fadD and fadE mutants, may be employed.

In studies with bay thioesterase, growth of fadD bay thioesterase transformants relative to the vector transformed control was severely retarded at 37° C., and less so at 25°-30° C. Liquid cultures growing at the lower temperatures accumulated a precipitate, and colonies formed on petri dishes at 25° C. deposited large quantities of laurate crystals, especially at the surface. These deposits were identified as laurate by FAB-mass spectrometry. An abnormal growth rate phenotype is also noted in E. coli cells expressing an elm medium-chain preferring acyl-ACP thioesterase. At 37° C., the elm thioesterase appears to be toxic to the cells, and at 25° C. or 30° C. the cells grow much more slowly than control non-transformed cells. It has been noted with both bay and elm thioesterase-expressing E. coli cells that variants which grow at the same rate as control cells at 25° C. or 30° C. may be selected when the transformed cells are grown for several generations. In addition, when a bay thioesterase-expressing normal growth phenotype variant is cured of the bay thioesterase encoding plasmid and retransformed with a similar plasmid containing the elm thioesterase expression construct, the elm thioesterase expressing cells exhibit a normal growth phenotype in the first generation of cells comprising the construct. Similarly, myristate crystals are produced in fadD E. coli transformants expressing a Vibrio C14 thioesterase gene. In this instance the growth temperature does not significantly affect cell growth or myristate production. After separation and quantitation by gas chromatography, it is estimated that the laurate crystals deposited by the fadD-bay thioesterase transformants on petri dishes represented about 30-100% of the total dry weight of the producing bacteria.

When expression of the medium-chain thioesterase is desired in plant cells, various plants of interest include, but are not limited to, rapeseed (Canola and High Erucic Acid varieties), sunflower, safflower, cotton, Cuphea, soybean, peanut, coconut and oil palms, and corn. Depending on the method for introducing the recombinant constructs into the host cell, other DNA sequences may be required. Importantly, this invention is applicable to dicotyledonous and monocotyledonous species alike and will be readily applicable to new and/or improved transformation and regulation techniques.

In any event, the method of transformation is not critical to the instant invention; various methods of plant transformation are currently available. As newer methods are available to transform crops, they may be directly applied hereunder. For example, many plant species naturally susceptible to Agrobacterium infection may be successfully transformed via tripartite or binary vector methods of Agrobacterium mediated transformation. In addition, techniques of microinjection, DNA particle bombardment, and electroporation have been developed which allow for the transformation of various monocot and dicot plant species.

The medium-chain fatty acids produced in the transgenic host cells of this invention are useful in various commercial applications. For example, C12 and C14 are used extensively in the detergent industry. C8 and C10 fatty acids are used as lubricants, for example in jet engines. C8 and C10 fatty acids also find use in high performance sports foods and in low calorie food applications.

The following examples are provided by way of illustration and not by limitation.

EXAMPLES

Example 1

Sources of Plant C8 and C10 Acyl-ACP Thioesterases

Discovery of a C10 preferring acyl-ACP thioesterase activity in developing seeds from Cuphea hookeriana is described in WO 91/16421. Other plants may also be sources of desirable thioesterases which have preferences for fatty acyl chain lengths of C8 or C10. Such additional plant thioesterases may be identified by analyzing the triacylglyceride composition of various plant oils and the presence of a specific thioesterase confirmed by assays using the appropriate acyl-ACP substrate. The assay for C10 preferring acyl-ACP thioesterase, as described for example in WO 91/16421, may be used for such analyses.

For example, other plants which are now discovered to have desirable thioesterase enzymes include elm (Ulmaceae) and coconut (Cocos nucifera). A significant percentage of 10:0 fatty acids are detected in elm seeds, and both 8:0 and 10:0 fatty acids are prominent in seeds from coconut. Results of biochemical assays to test for thioesterase activity in developing embryos from elm and coconut are presented below in Table 1.

TABLE 1
______________________________________
              Activity
              (mean cpm in ether extract)
Substrate     elm      coconut
______________________________________
 8:0-ACP      84       784
10:0-ACP      2199     1162
12:0-ACP      383      1308
14:0-ACP      1774     573
16:0-ACP      3460     902
18:1-ACP      3931     2245
______________________________________

With elm, a peak of thioesterase activity is seen with the C10:0-ACP substrate, in addition to significant activity with longer-chain substrates. This evidence suggests that a thioesterase with specific activity towards C10:0-ACP substrate is present in elm embryos. With coconut, endosperm thioesterase activity is seen with C8:0, C10:0, C12:0 and C14:0 medium-chain substrates, as shown in Table 1. These activities accord with the considerable C8:0, C10:0, C12:0, and C14:0 fatty acyl contents of the endosperm oil, suggesting that one or more thioesterases with activity on these medium chain acyl-ACPs are present in coconut endosperm and responsible for medium chain formation therein.
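The reading of Table 1 given above can be checked with a short script. The following Python sketch takes the cpm values from Table 1 and reports the substrates at which activity is higher than at the neighboring chain lengths; the function name and the simple neighbor comparison are illustrative choices, not part of the assay described herein.

```python
# Substrates in order of increasing chain length, as listed in Table 1.
SUBSTRATES = ["8:0-ACP", "10:0-ACP", "12:0-ACP",
              "14:0-ACP", "16:0-ACP", "18:1-ACP"]

# Mean cpm in ether extract, copied from Table 1.
TABLE_1 = {
    "elm":     [84, 2199, 383, 1774, 3460, 3931],
    "coconut": [784, 1162, 1308, 573, 902, 2245],
}


def local_peaks(values, labels=SUBSTRATES):
    """Return the labels whose value exceeds both neighbors.

    The first and last entries are compared with their single neighbor.
    This is an illustrative heuristic, not a statistical test.
    """
    peaks = []
    for i, label in enumerate(labels):
        left = values[i - 1] if i > 0 else float("-inf")
        right = values[i + 1] if i < len(values) - 1 else float("-inf")
        if values[i] > left and values[i] > right:
            peaks.append(label)
    return peaks


if __name__ == "__main__":
    for source, values in TABLE_1.items():
        print(source, local_peaks(values))
```

For elm, the comparison singles out the 10:0-ACP substrate among the medium-chain substrates, which is the peak referred to in the text.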

Example 2

Acyl-ACP Thioesterase cDNA Sequences

A. Bay

Sequence of a full length bay C12 preferring acyl-ACP thioesterase cDNA clone, pCGN3822, (3A-17), is presented in FIG. 1.

The N-terminal sequence of the mature bay thioesterase, isolated from the developing seeds, has been reported as beginning at amino acid residue 84 of the derived protein sequence (WO 92/20236). The remaining N-terminal amino acids would therefore be expected to represent sequence of a transit peptide. This 83 amino acid sequence has features common to plastid transit peptides, which are usually between 40 and 100 amino acids long (Keegstra et al., Ann. Rev. Plant Physiol. and Plant Mol. Biol. (1989) 40:471-501). A hydropathy plot of this transit peptide region reveals a hydrophobic domain at each end of the transit sequence. Other transit peptide sequences have been shown to contain similar hydrophobic N-terminal domains. The significance of this N-terminal domain is not known, but certain experiments suggest that lipid-mediated binding may be important for plastid import of some proteins (Friedman and Keegstra, Plant Physiol. (1989) 89:993-999). As to the C-terminal domain, comparison of hydropathy plots of known imported chloroplastic stromal protein transit peptides (Keegstra et al, supra) indicates that these transit peptides do not have a hydrophobic domain at the C-terminus. However, preproteins destined to the thylakoid lumen of the chloroplast have an alanine-rich hydrophobic domain at the C-terminal end of their transit peptides (Smeekens et al., TIBS (1990) 15:73-76). The existence of such a domain in the transit sequence of the bay thioesterase might suggest that it has a double-domain transit peptide targeting this enzyme to the lumen of the thylakoid equivalent or to the intermembrane space. This is unexpected, since the substrate, acyl-ACP, has been detected in the stroma (Ohlrogge et al., Proc. Nat. Acad. Sci. (1979) 76:1194-1198). 
An alternative explanation for the existence of such a domain in the bay thioesterase preprotein is that it may represent a membrane anchor of the mature protein that is cleaved upon purification, leading to a sequence determination of an artificial N-terminus. The in vivo N-terminus of the mature thioesterase protein would then lie at a location further upstream than indicated by amino acid sequence analysis.
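A hydropathy plot of the kind referred to above can be sketched in a few lines of Python. The example below assumes the Kyte-Doolittle hydropathy scale and a sliding-window average; the specification does not state which scale or window was used, and the test sequence is a hypothetical peptide, not the bay transit peptide.

```python
# Kyte-Doolittle hydropathy values (ASSUMPTION: the specification does not
# name the scale used for its hydropathy plots).
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def hydropathy_profile(sequence, window=9):
    """Mean hydropathy of each window of `window` residues along the sequence."""
    scores = [KYTE_DOOLITTLE[aa] for aa in sequence.upper()]
    return [sum(scores[i:i + window]) / window
            for i in range(len(scores) - window + 1)]


def hydrophobic_windows(sequence, window=9, threshold=1.5):
    """Start positions (0-based) of windows whose mean exceeds `threshold`.

    The threshold is a hypothetical cutoff chosen for illustration.
    """
    profile = hydropathy_profile(sequence, window)
    return [i for i, score in enumerate(profile) if score > threshold]


if __name__ == "__main__":
    # Hypothetical peptide with hydrophobic stretches at both ends.
    peptide = "MALLVIAALLSSKRDEKSPTRDEQGSKKAAVLLIVAALV"
    print(hydrophobic_windows(peptide))
```

Windows flagged near both ends of a sequence would correspond to the pattern described for the bay transit peptide region, with a hydrophobic domain at each end.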

Analysis of additional plant medium-chain acyl-ACP thioesterase sequences, such as those encoded by the elm and Cuphea clones described herein, indicates extensive homology in the region initially identified as the C-terminal domain of the bay C12 preferring acyl-ACP thioesterase transit peptide. It is thus possible that this postulated transit peptide "C-terminal domain" in fact represents a further N-terminal region of the mature bay thioesterase. In such a case, the leucine residue indicated as amino acid number 60 in FIG. 1 is a candidate for the N-terminus of the mature bay C12 thioesterase protein. Western analysis of transgenic Brassica plants expressing the bay C12 thioesterase protein reveals a protein band of approximately 41 kD, which size is consistent with the suggestion that the mature protein N-terminus is located at or near the leucine residue, amino acid number 60.

Gene bank searches with the derived amino acid sequences of plant medium-chain preferring acyl-ACP thioesterases do not reveal significant matches with any entry, including the vertebrate medium-chain acyl-ACP thioesterase II (Naggert et al., Biochem. J. (1987) 243:597-601). Also, the plant medium-chain preferring acyl-ACP thioesterases do not contain a sequence resembling the fatty acid synthetase thioesterase active-site motif (Aitken, 1990 in Identification of Protein Consensus Sequences, Active Site Motifs, Phosphorylation and other Post-translational Modifications (Ellis Horwood, Chichester, West Sussex, England), pp. 40-147).

B. Cuphea

DNA sequence encoding a portion of a Cuphea hookeriana thioesterase protein (FIG. 3) may be obtained by PCR as described in WO 92/20236.

Additional DNA sequences corresponding to Cuphea thioesterase peptide regions are obtained by PCR using degenerate oligonucleotides designed from peptide fragments from conserved regions of plant thioesterases described in WO 92/20236. A forward primer, TECU9, contains 17 nucleotides corresponding to all possible coding sequences for amino acids 176-181 of the bay and camphor thioesterase proteins. A reverse primer, TECU3A, contains 18 nucleotides corresponding to the complement of all possible coding sequences for amino acids 283-288 of the bay and camphor thioesterase proteins. In addition, the forward and reverse primers contain BamHI or XhoI restriction sites, respectively, at the 5' end, and the reverse primer contains an inosine nucleotide at the 3' end. The safflower, bay and camphor sequences diverge at two amino acid positions in the forward primer region, and at one amino acid residue in the reverse primer region. The degeneracy of oligonucleotide primers is such that they could encode the safflower, bay and camphor sequences.
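The notion of a primer covering "all possible coding sequences" for a six amino acid stretch can be illustrated as follows. This Python sketch enumerates every DNA sequence encoding a short peptide under the standard genetic code, and shows how dropping the final wobble base gives a 17 nucleotide primer from six codons. The peptide used is hypothetical and is not the sequence of amino acids 176-181 or 283-288 of any thioesterase.

```python
from itertools import product

# Standard genetic code, codons ordered with bases T, C, A, G.
BASES = "TCAG"
AMINO_ACIDS = ("FFLLSSSSYY**CC*W" "LLLLPPPPHHQQRRRR"
               "IIIMTTTTNNKKSSRR" "VVVVAAAADDEEGGGG")
CODON_TABLE = {"".join(codon): aa
               for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}


def codons_for(amino_acid):
    """All codons encoding one amino acid (one-letter code)."""
    return sorted(c for c, aa in CODON_TABLE.items() if aa == amino_acid)


def coding_sequences(peptide, drop_last_bases=0):
    """All distinct DNA sequences encoding `peptide`.

    `drop_last_bases` trims bases from the 3' end, e.g. 1 to omit the
    final wobble position and obtain a 17-mer from six codons.
    """
    per_residue = [codons_for(aa) for aa in peptide.upper()]
    sequences = {"".join(combo) for combo in product(*per_residue)}
    if drop_last_bases:
        sequences = {s[:-drop_last_bases] for s in sequences}
    return sorted(sequences)


if __name__ == "__main__":
    hypothetical_peptide = "MWDEKF"  # illustration only
    full = coding_sequences(hypothetical_peptide)
    trimmed = coding_sequences(hypothetical_peptide, drop_last_bases=1)
    print(len(full), "sequences of", len(full[0]), "nt")
    print(len(trimmed), "sequences of", len(trimmed[0]), "nt")
```

A degenerate oligonucleotide pool is synthesized as a mixture rather than enumerated sequence by sequence; the enumeration simply makes the degeneracy count explicit.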

Polymerase chain reaction samples (100 μl) are prepared using reverse transcribed Cuphea hookeriana RNA as template and 1 μM of each of the oligonucleotide primers. PCR products are analyzed by agarose gel electrophoresis, and an approximately 300 bp DNA fragment, the predicted size from the thioesterase peptide sequences, is observed. The DNA fragment, designated C93A (Cuphea) is isolated and cloned into a convenient plasmid vector using the PCR-inserted BamHI and XhoI restriction digest sites. DNA sequence of representative clones is obtained. Analysis of these sequences indicates that at least two different, but homologous Cuphea hookeriana cDNAs were amplified. The DNA sequences of two Cuphea PCR fragments, 14-2 and 14-9, are presented in FIG. 4.

Total RNA for cDNA library construction may be isolated from developing Cuphea embryos by modifying the DNA isolation method of Webb and Knapp (Plant Mol. Biol. Reporter (1990) 8:180-195). Buffers include:

REC: 50 mM TrisCl pH 9, 0.7M NaCl, 10 mM EDTA pH8, 0.5% CTAB.

REC+: Add B-mercaptoethanol to 1% immediately prior to use.

RECP: 50 mM TrisCl pH9, 10 mM EDTA pH8, and 0.5% CTAB.

RECP+: Add β-mercaptoethanol to 1% immediately prior to use.

For extraction of 1 g of tissue, 10 ml of REC+ and 0.5 g of PVPP are added to tissue that has been ground in liquid nitrogen and homogenized. The homogenized material is centrifuged for 10 min at 1200 rpm. The supernatant is poured through miracloth onto 3 ml cold chloroform and homogenized again. After centrifugation at 12,000 rpm for 10 min, the upper phase is taken and its volume determined. An equal volume of RECP+ is added and the mixture is allowed to stand for 20 min. at room temperature. The material is centrifuged for 20 min. at 10,000 rpm twice and the supernatant is discarded after each spin. The pellet is dissolved in 0.4 ml of 1M NaCl (DEPC) and extracted with an equal volume of phenol/chloroform. Following ethanol precipitation, the pellet is dissolved in 1 ml of DEPC water. Poly (A) RNA may be isolated from this total RNA according to Maniatis et al. (Molecular Cloning: A Laboratory Manual (1982) Cold Spring Harbor, N.Y.). cDNA libraries may be constructed in commercially available plasmid or phage vectors.

The thioesterase encoding fragments obtained by PCR as described above are labeled and used to screen Cuphea cDNA libraries to isolate thioesterase cDNAs. Preliminary DNA sequence of a Cuphea cDNA clone TAA 342 is presented in FIG. 5. Translated amino acid sequence of the Cuphea clone from the presumed mature N-terminus (based on homology to the bay thioesterase) is shown.

The sequence is preliminary and does not reveal a single open reading frame in the 5' region of the clone. An open reading frame believed to represent the mature protein sequence is shown below the corresponding DNA sequence. The N-terminal amino acid was selected based on homology to the bay thioesterase protein.

Additional Cuphea cDNA clones were obtained by screening a cDNA library prepared using a Uni-ZAP (Stratagene) phage library cloning system. The library was screened using radiolabeled TAA 342 DNA. The library was hybridized at 42° C. using 30% formamide, and washing was conducted at low stringency (room temperature with 1× SSC, 0.1% SDS). Numerous thioesterase clones were identified and DNA sequences determined. Three classes of Cuphea cDNA clones have been identified. The original TAA 342 clone discussed above is representative of CUPH-1 type clones which have extensive regions of homology to other plant medium-chain preferring acyl-ACP thioesterases. Nucleic acid sequence and translated amino acid sequence of a CUPH-1 clone, CMT9, are shown in FIG. 6. The mature protein is believed to begin either at or near the leucine at amino acid position 88, or the leucine at amino acid position 112. From comparison of TAA 342 to CMT9, it is now believed that the TAA 342 sequence is missing a base which if present would shift the reading frame of the TAA 342 CUPH-1 clone to agree with the CUPH-1 thioesterase encoding sequence on CMT9. In particular, the stop codon for CUPH-1 is now believed to be the TAG triplet at nucleotides 1391-1393 of FIG. 5.

DNA sequence of an additional CUPH-1 clone, CMT10, is shown in FIG. 9. CMT10 has greater than 90% sequence identity with CMT9, but less than the approximately 99% sequence identity noted in fragments from other CUPH-1 type clones.

A second class of Cuphea thioesterase cDNAs is identified as CUPH-2. These cDNAs also demonstrate extensive homology to other plant medium-chain acyl-ACP thioesterases. Expression of a representative clone, CMT7, in E. coli (discussed in more detail below), indicates that CUPH-2 clones encode a medium-chain preferring acyl-ACP thioesterase protein having preferential activity towards C8 and C10 acyl-ACP substrates. DNA sequence and translated amino acid sequence of CMT7 is shown in FIG. 7.

Preliminary DNA sequence from the 5' end of an additional CUPH-2 clone, CMT13, is shown in FIG. 8. Although CMT13 demonstrates extensive sequence identity with CMT7, DNA sequence alignment reveals several gaps, which together total approximately 48 nucleotides, where the CMT13 clone is missing sequences present in the CMT7 clone.

DNA sequence analysis of a third class of Cuphea thioesterase cDNA clones indicates extensive homology at the DNA and amino acid level to 18:1 acyl-ACP thioesterases from Brassica (FIG. 11) and safflower (WO 92/20236). DNA sequence and translated amino acid sequence of a representative clone, CLT2, is shown in FIG. 10.

C. Elm

Elm acyl-ACP thioesterase clones may also be obtained using PCR primers for plant thioesterase sequences as discussed above for Cuphea. TECU9 and TECU3A are used in PCR reactions using reverse transcribed RNA isolated from elm embryos as template. As with Cuphea, an approximately 300 nucleotide fragment, E93A, is obtained and used to probe an elm cDNA library. Nucleic acid sequence and translated amino acid sequence of an elm medium-chain preferring acyl-ACP thioesterase clone are shown in FIG. 2. The clone encodes the entire mature elm thioesterase protein, but appears to be lacking some of the transit peptide encoding region. By comparison with other plant medium-chain acyl-ACP thioesterases, the mature elm protein is believed to begin either at the leucine indicated as amino acid number 54, or at the aspartate indicated as amino acid number 79.

Example 3

Expression of Acyl-ACP Thioesterases In E. coli

A. Expression of elm thioesterase.

An elm acyl-ACP thioesterase cDNA clone is expressed in E. coli as a lacZ fusion. The ULM1 cDNA clone, KA10, represented in FIG. 2 is digested with StuI and XbaI to produce an approximately 1000 base pair fragment containing the majority of the mature elm thioesterase encoding sequence. The StuI site is located at nucleotides 250-255 of the sequence shown in FIG. 2, and the XbaI site is located at nucleotides 1251-1256, 3' to the stop codon. As discussed above, the N-terminus for the mature elm thioesterase is believed to be either the leucine residue encoded by nucleotides 160-162 or the aspartate residue encoded by nucleotides 235-237. The StuI/XbaI fragment is inserted into StuI/XbaI digested pUC118 resulting in construct KA11. For expression analysis, KA11 is used to transform E. coli strain DH5α or an E. coli mutant, fadD, which lacks medium-chain specific acyl-CoA synthetase (Overath et al., Eur. J. Biochem (1969) 7:559-574).
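The stated fragment size can be checked against the site coordinates. The following is a minimal sketch, assuming the standard cut positions of StuI (AGG^CCT, blunt) and XbaI (T^CTAGA, four-base 5' overhang); the function name and the offsets table are illustrative and are not part of the procedure described here.

```python
# Number of bases of each recognition site that lie 5' of the top-strand cut.
# Assumed from the standard cut positions: StuI AGG^CCT, XbaI T^CTAGA.
TOP_STRAND_CUT_OFFSET = {"StuI": 3, "XbaI": 1}


def top_strand_fragment_length(site1, start1, site2, start2):
    """Length of the top strand released between two cuts.

    start1 and start2 are the 1-based coordinates of the first base of each
    recognition site, with site1 upstream of site2.
    """
    last_base_before_cut1 = start1 + TOP_STRAND_CUT_OFFSET[site1] - 1
    last_base_before_cut2 = start2 + TOP_STRAND_CUT_OFFSET[site2] - 1
    return last_base_before_cut2 - last_base_before_cut1


# Elm clone of FIG. 2: StuI site at 250-255, XbaI site at 1251-1256.
elm_fragment = top_strand_fragment_length("StuI", 250, "XbaI", 1251)
print(elm_fragment)
```

Under these assumptions the top strand of the StuI/XbaI fragment spans nucleotides 253-1251, which agrees with the approximately 1000 base pair size given above.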

As has been observed with bay thioesterase constructs, E. coli clones expressing the elm thioesterase exhibited abnormal growth rate and morphology phenotypes. The growth rate of E. coli DH5α (fadD⁺) or fadD mutant cells expressing the elm thioesterase is initially much slower than growth of control cells at either 25° C. or 30° C. At 37° C., the elm thioesterase plasmid appears to be toxic to the E. coli cells. After growing the transformed cultures for several generations, variants may be selected which grow at the same rate as control cells at 25° C. or 30° C. A similar result was seen with fadD cells comprising bay thioesterase expression constructs. A fadD mutant strain selected as having a normal growth rate when expressing the bay thioesterase was cured of the bay thioesterase construct and transformed with the elm thioesterase construct. This strain exhibits a normal growth phenotype in the first generation of cells comprising the elm thioesterase construct.

For thioesterase activity and fatty acid composition assays, a 25-50 ml culture of E. coli cells containing the elm thioesterase construct, and a similar culture of control cells are grown at 25° C. to an OD₆₀₀ of ˜0.5. Induction of the thioesterase expression may be achieved by the addition of IPTG to 0.4 mM followed by 1 or 2 hours further growth. For slow growing cultures, longer growth periods may be required following addition of IPTG.

A ten-ml aliquot of each culture (containing cells plus the culture medium) is assayed for specific activity towards C10:0-ACP and C16:0-ACP substrates as follows. Cells are harvested by centrifugation, resuspended in 0.5 ml assay buffer and lysed by sonication. Cell debris may be removed by further centrifugation. The supernatant is then used in thioesterase activity assays as per Pollard et al., Arch. Biochem & Biophys. (1991) 281:306-312 using C10:0-ACP and C16:0-ACP substrates.

The activity assays from normal growth phenotype KA11 cells reproducibly demonstrate differentially elevated C10:0-ACP and C16:0-ACP hydrolysis activities. Upon induction with IPTG, the C10:0-ACP and C16:0-ACP activities are affected differently. The specific activity of the C16:0-ACP hydrolysis decreases slightly, while that of the C10:0-ACP hydrolysis increases by approximately 44%. These data suggest that the C16:0-ACP hydrolysis activity is derived from the E. coli cells, rather than the elm thioesterase. As discussed in more detail below, a similar C16:0-ACP hydrolysis activity is detected in E. coli cells transformed with a Cuphea hookeriana thioesterase clone, CUPH-1.

For analysis of the fatty acid composition, a 4.5 ml sample of E. coli cells grown and induced as described above is transferred into a 15 ml glass vial with a teflon-lined cap. 100 μl of a 1 mg/ml standards solution containing 1 mg/ml each of C11:0 free fatty acid, C15:0 free fatty acid, and C17:0 TAG in 1:1 chloroform/methanol is added to the sample, followed by addition of 200 μl of glacial acetic acid and 10 ml of 1:1 chloroform/methanol. The samples are vortexed to mix thoroughly and centrifuged for 5 minutes at 1000 rpm for complete phase separation. The lower (chloroform) phase is carefully removed and transferred to a clean flask appropriate for use in a rotary evaporator (Rotovap). The sample is evaporated to near dryness. As medium-chain fatty acids appear to evaporate preferentially after solvent is removed, it is important to use just enough heat to maintain the vials at room temperature. The dried samples are methanolyzed by adding 1 ml of 5% sulfuric acid in methanol, transferring the samples to a 5 ml vial, and incubating the sample in a 90° C. water bath for 2 hours. The sample is allowed to cool, after which 1 ml of 0.9% NaCl and 300 μl of hexane are added. The sample is vortexed to mix thoroughly and centrifuged at 1000 rpm for 5 minutes. The top (hexane) layer is carefully removed and placed in a plastic autosampler vial with a glass cone insert, followed by capping of the vial with a crimp seal.

The samples are analyzed by gas-liquid chromatography (GC) using a temperature program to enhance the separation of components having 10 or fewer carbons. The temperature program used provides for a temperature of 140° C. for 3 minutes, followed by a temperature increase of 5° C./minute until 230° C. is reached, and 230° C. is maintained for 11 minutes. Samples are analyzed on a Hewlett-Packard 5890 (Palo Alto, Calif.) gas chromatograph. Fatty acid content calculations are based on the internal standards.
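The temperature program can be written out as a piecewise function of run time. This is a minimal sketch using only the set points given above (140° C. hold, 5° C./minute ramp, 230° C. hold); the function names are illustrative and do not correspond to any instrument software.

```python
START_C = 140.0          # initial oven temperature, deg C
INITIAL_HOLD_MIN = 3.0   # hold at the initial temperature
RAMP_C_PER_MIN = 5.0     # ramp rate
FINAL_C = 230.0          # final oven temperature, deg C
FINAL_HOLD_MIN = 11.0    # hold at the final temperature


def ramp_minutes():
    """Time needed to ramp from the initial to the final temperature."""
    return (FINAL_C - START_C) / RAMP_C_PER_MIN


def program_minutes():
    """Total length of the program: initial hold + ramp + final hold."""
    return INITIAL_HOLD_MIN + ramp_minutes() + FINAL_HOLD_MIN


def oven_temperature(minutes):
    """Programmed oven temperature (deg C) at a given time into the run."""
    if minutes < 0 or minutes > program_minutes():
        raise ValueError("time is outside the temperature program")
    if minutes <= INITIAL_HOLD_MIN:
        return START_C
    if minutes <= INITIAL_HOLD_MIN + ramp_minutes():
        return START_C + RAMP_C_PER_MIN * (minutes - INITIAL_HOLD_MIN)
    return FINAL_C


print(program_minutes(), oven_temperature(10))
```

With these set points the ramp takes 18 minutes, so a single run lasts 32 minutes.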

GC analysis indicates that the slow growing E. coli DH5α cells expressing the elm thioesterase contained approximately 46.5 mole % C10:0 and 33.3 mole % C8:0 fatty acids as compared to fatty acid levels in control cultures of 1.8 mole % C10:0 and 3.1 mole % C8:0. The largest percentage component of the control culture was C16:0 at 45.2 mole %. In comparison, the KA11 culture contained only approximately 8.4 mole % C16:0. Similar analyses on a later generation of KA11 cells which exhibited a normal growth rate phenotype revealed lower percentages of C10:0, 25.9 mole %, and C8:0, 18.9 mole %, fatty acids. In this later study, the control E. coli culture contained approximately 5 mole % each of C10:0 and C8:0.

B. Expression of Cuphea hookeriana thioesterases.

1. The CUPH-2 type C. hookeriana cDNA clone shown in FIG. 7 (CMT7) is expressed as a lacZ fusion in E. coli. CMT7 is digested with StuI and partially digested with XhoI, and the approximately 1100 base pair fragment containing the majority of the thioesterase encoding region is cloned into SmaI/SalI digested pUC118, resulting in construct KA17. The StuI site in CMT7 is located at nucleotides 380-385 of the sequence shown in FIG. 7, and the XhoI site is located following the 3' end of the cDNA clone in the vector cloning region. As discussed above, the N-terminus for the mature CUPH-2 thioesterase is believed to be either the aspartate residue encoded by nucleotides 365-367 or the leucine residue encoded by nucleotides 293-295. For expression analysis, KA17 is used to transform E. coli fadD⁺ cells (commercially available cells such as SURE cells from BRL may be used) or an E. coli mutant, fadD, which lacks medium-chain specific acyl-CoA synthetase (Overath et al., Eur. J. Biochem (1969) 7:559-574).

Unlike the results with bay and elm, E. coli fadD⁺ cells transformed with KA17 exhibit no unusual growth or morphology phenotype. However, in fadD mutants, the plasmid is not maintained at 37° C. At 30° C., the transformed cells grow slightly slower and form smaller colonies on media plates although the plasmid is stably maintained.

GC analysis is conducted on cultures of both fadD⁺ and fadD mutant strains expressing KA17 thioesterase. An increase in C8:0 and to a lesser extent C10:0 fatty acid accumulation is observed in both fadD⁺ and fadD mutant strains. In one experiment, levels of C8:0 and C10:0 fatty acyl groups in fadD⁺ cells following a 2 hour induction were 23.5 and 8.1 mole % respectively. Levels of C8:0 and C10:0 fatty acyl groups after 2 hour induction in control cells were 3.9 and 3.0 mole % respectively. In a fadD mutant strain, fatty acids were measured following overnight induction. In cells transformed with KA17, C8:0 and C10:0 levels were 51.5 and 14.3 mole % respectively. In control cells C8:0 and C10:0 levels were 2.3 and 2.5 mole % respectively.
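The relative enrichment of each fatty acid can be read off by dividing the mole % in KA17 cells by the mole % in the matching control. The sketch below uses only the values quoted in the preceding paragraph; the dictionary layout and function name are illustrative.

```python
# Mole % values quoted in the text for KA17 (CUPH-2) cultures and controls.
MOLE_PERCENT = {
    "fadD+": {  # 2 hour induction
        "KA17": {"C8:0": 23.5, "C10:0": 8.1},
        "control": {"C8:0": 3.9, "C10:0": 3.0},
    },
    "fadD": {  # overnight induction
        "KA17": {"C8:0": 51.5, "C10:0": 14.3},
        "control": {"C8:0": 2.3, "C10:0": 2.5},
    },
}


def fold_over_control(strain, fatty_acid):
    """Ratio of mole % in KA17 cells to mole % in control cells."""
    values = MOLE_PERCENT[strain]
    return values["KA17"][fatty_acid] / values["control"][fatty_acid]


for strain in MOLE_PERCENT:
    for fatty_acid in ("C8:0", "C10:0"):
        print(strain, fatty_acid, round(fold_over_control(strain, fatty_acid), 1))
```

In both strains the C8:0 ratio exceeds the C10:0 ratio, consistent with the observation that C8:0 accumulates to a greater extent than C10:0.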

2. A construct for expression of a Cuphea hookeriana CUPH-1 type thioesterase in E. coli is also prepared. The construct encodes a lacZ fusion of the Cuphea mature protein sequence shown in FIG. 5. The fusion protein is expressed in both wild-type (K12) and fadD strains of E. coli. Both strains of E. coli deposit large amounts of crystals when transformed with the Cuphea expression construct. In addition, both transformed strains exhibit growth retardation, which is slight in the K12 cells and severe in the fadD mutants. The slow growth phenotype is believed to be due to a toxic effect of C8 and C10 fatty acids on the E. coli cells. Fatty acid analysis (acid methanolysis) of K12 and fadD transformants does not indicate accumulation of a particular fatty acid. It is believed that the crystals observed in these cells may represent an altered form of a medium chain fatty acid that is not detectable by the acid methanolysis methods utilized. Studies of the ability of the cell extracts to hydrolyze acyl-ACP substrates indicate increased hydrolysis activity towards medium chain fatty acyl-ACP C8, C10 and C12 substrates in transformed fadD cells. Results of these analyses are shown in Table 2.

TABLE 2
______________________________________
Lysate          Substrate    Hydrolysis Activity
______________________________________
Cuphea clone     8:0-ACP      830
"               10:0-ACP     1444
"               12:0-ACP     1540
"               14:0-ACP     1209
"               18:1-ACP     1015
control          8:0-ACP        4
"               10:0-ACP       52
"               12:0-ACP       63
"               14:0-ACP      145
"               18:1-ACP      128
______________________________________

Normalization of the assay results to the C18:1 levels reveals a significant increase in the C8:0, C10:0 and C12:0-ACP thioesterase activities.
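The normalization can be reproduced directly from the Table 2 values. The following is a minimal sketch: each activity is divided by the 18:1-ACP activity of the same lysate, and the normalized Cuphea value is then compared with the normalized control value. The variable and function names are illustrative, and the activity units are whatever units Table 2 reports (they are not stated in the table).

```python
# Hydrolysis activities from Table 2, keyed by acyl-ACP substrate.
CUPHEA = {"8:0": 830, "10:0": 1444, "12:0": 1540, "14:0": 1209, "18:1": 1015}
CONTROL = {"8:0": 4, "10:0": 52, "12:0": 63, "14:0": 145, "18:1": 128}


def normalize_to_18_1(activities):
    """Express each activity relative to the 18:1-ACP activity of the lysate."""
    reference = activities["18:1"]
    return {substrate: value / reference for substrate, value in activities.items()}


cuphea_norm = normalize_to_18_1(CUPHEA)
control_norm = normalize_to_18_1(CONTROL)

# Ratio of normalized Cuphea activity to normalized control activity.
relative_increase = {
    substrate: cuphea_norm[substrate] / control_norm[substrate]
    for substrate in CUPHEA
}

for substrate, ratio in relative_increase.items():
    print(f"{substrate}-ACP: {ratio:.1f}x control after normalization")
```

After normalization the 8:0, 10:0 and 12:0 activities are each more than three times the control value, while the 14:0 activity is close to the control value.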

Further analyses of fast growing variants expressing the CUPH-1 thioesterase were conducted. Isolation and analysis of the crystals produced by the CUPH-1 expressing E. coli cells indicates that these crystals are comprised of predominantly C16 and C14 fatty acids. In addition, further analyses revealed an increase in hydrolysis activity towards C16 fatty acids in these cells. It is not clear if the C16 activity and fatty acid production are a direct result of the CUPH-1 thioesterase, or if this effect is derived from the E. coli cells.

C. Expression of Myristoyl ACP Thioesterase in E. coli

A Vibrio harveyi myristoyl ACP thioesterase encoding sequence (Miyamoto et al., J. Biol. Chem. (1988) 262:13393-13399) lacking the initial ATG codon is prepared by PCR. The gene is expressed in E. coli as a lacZ fusion and E. coli extracts are assayed to confirm myristoyl ACP thioesterase activity. The C14 thioesterase construct is used to transform an E. coli fadD strain. The cells transformed in this manner deposit large quantities of crystals which are identified as potassium myristate by mass spectrometry. Fatty acid analysis of the E. coli extracts reveals that greater than 50% (on a mole basis) of the fatty acids are C14:0, as compared to control E. coli fadD cells which contain approximately 11.5 mole percent C14:0.

Example 4

Constructs for Plant Transformation

Constructs for expression of Cuphea and elm thioesterases in plant cells which utilize a napin expression cassette are prepared as follows.

A. Napin Expression Cassette

A napin expression cassette, pCGN1808, is described in copending U.S. patent application Ser. No. 07/742,834 which is incorporated herein by reference. pCGN1808 is modified to contain flanking restriction sites to allow movement of only the expression sequences and not the antibiotic resistance marker to binary vectors. Synthetic oligonucleotides containing KpnI, NotI and HindIII restriction sites are annealed and ligated at the unique HindIII site of pCGN1808, such that only one HindIII site is recovered. The resulting plasmid, pCGN3200 contains unique HindIII, NotI and KpnI restriction sites at the 3'-end of the napin 3'-regulatory sequences as confirmed by sequence analysis.

The majority of the napin expression cassette is subcloned from pCGN3200 by digestion with HindIII and SacI and ligation to HindIII and SacI digested pIC19R (Marsh, et al. (1984) Gene 32:481-485) to make pCGN3212. The extreme 5'-sequences of the napin promoter region are reconstructed by PCR using pCGN3200 as a template and two primers flanking the SacI site and the junction of the napin 5'-promoter and the pUC backbone of pCGN3200 from the pCGN1808 construct. The forward primer contains ClaI, HindIII, NotI, and KpnI restriction sites as well as nucleotides 408-423 of the napin 5'-sequence (from the EcoRV site) and the reverse primer contains the complement to napin sequences 718-739 which include the unique SacI site in the 5'-promoter. The PCR was performed in a Perkin Elmer/Cetus thermocycler according to manufacturer's specifications. The PCR fragment is subcloned as a blunt-ended fragment into pUC8 (Vieira and Messing (1982) Gene 19:259-268) digested with HincII to give pCGN3217. Sequencing of pCGN3217 across the napin insert verifies that no improper nucleotides were introduced by PCR. The napin 5'-sequences in pCGN3217 are ligated to the remainder of the napin expression cassette by digestion with ClaI and SacI and ligation to pCGN3212 digested with ClaI and SacI. The resulting expression cassette, pCGN3221, is digested with HindIII and the napin expression sequences are gel purified away and ligated to pIC20H (Marsh, supra) digested with HindIII. The final expression cassette is pCGN3223, which contains, in an ampicillin resistant background, essentially identical 1.725 kb napin 5' and 1.265 kb 3' regulatory sequences as found in pCGN1808. The regulatory regions are flanked with HindIII, NotI and KpnI restriction sites and unique SalI, BglII, PstI, and XhoI cloning sites are located between the 5' and 3' noncoding regions.

B. Cuphea Acyl-ACP Thioesterase Expression Construct

PCR analysis of Cuphea hookeriana reverse transcribed cDNA indicated that the 5' region of the TAA 342 CUPH-1 clone was lacking a guanine nucleotide (G) following nucleotide 144 of the sequence shown in FIG. 5. (DNA sequence analysis of the CMT9 CUPH-1 clone confirms the presence of the G nucleotide in that region.) Thus, a G nucleotide was inserted after nucleotide 144 in TAA 342 by PCR directed mutagenesis resulting in an encoding region beginning at the ATG at 143-145 of the sequence shown in FIG. 5. The corrected encoding sequence was cloned into a convenient vector using SalI and XhoI sites (also inserted in the PCR reaction), resulting in KA2. A SalI fragment of the resulting clone, comprising nucleotides 137-1464 of the sequence shown in FIG. 5 (plus the inserted G nucleotide discussed above), was cloned into napin expression cassette pCGN3223. The napin/Cuphea thioesterase/napin construct was then excised as a HindIII fragment and cloned into the binary vector pCGN1557 (McBride and Summerfelt (1990) Plant Mol. Biol. 14:269-276). The resulting construct, pCGN4800, was transformed into Agrobacterium tumefaciens and used to prepare transformed plants.

Similarly, the Cuphea CUPH-2 clone, CMT7, is inserted into a napin expression cassette and the resulting napin 5'/CUPH-2/napin 3' construct is transferred to a binary vector for plant transformation.

C. Elm Acyl-ACP Thioesterase Expression Construct

A construct for expression of an elm C10 and C8 acyl-ACP thioesterase in plant seed cells using a napin expression cassette is prepared as follows. As discussed above, the elm ULM-1 medium-chain acyl-ACP thioesterase cDNA does not appear to encode the entire thioesterase transit peptide. Thus, the elm thioesterase coding region was fused to the transit peptide encoding region from the Cuphea CUPH-1 clone as follows. pCGN4800 (CUPH-1 in napin cassette) was digested with XbaI, blunted and digested with StuI to remove the mature protein coding portion of the CUPH-1 construct. The StuI site is located at nucleotides 496-501 of the CUPH-1 sequence shown in FIG. 5. The XbaI site is located between the end of the Cuphea thioesterase cDNA sequence and the napin 3' regulatory region. The ULM-1 mature protein encoding region is inserted into the napin/Cuphea transit peptide backbone resulting from removal of the Cuphea mature protein encoding region as follows. The ULM-1 clone is digested with XbaI, blunted and digested with StuI to obtain the elm thioesterase mature protein encoding region. The StuI site is located at nucleotides 250-255 of the sequence shown in FIG. 2, and the XbaI site is located at nucleotides 1251-1256, 3' to the stop codon. Ligation of the elm StuI/XbaI fragment into the napin/Cuphea transit peptide backbone results in pCGN4802, having the napin 5'/Cuphea transit:elm mature/napin 3' expression construct. pCGN4802 is transferred to pCGN1557 as a HindIII fragment resulting in pCGN4803, a binary construct for plant transformation.

Example 5

Plant Transformation

A variety of methods have been developed to insert a DNA sequence of interest into the genome of a plant host to obtain the transcription or transcription and translation of the sequence to effect phenotypic changes.

A. Brassica Transformation

Seeds of Brassica napus cv. Westar are soaked in 95% ethanol for 2 min., surface sterilized in a 1.0% solution of sodium hypochlorite containing a drop of Tween 20 for 45 min., and rinsed three times in sterile, distilled water. Seeds are then plated in Magenta boxes with 1/10th concentration of Murashige minimal organics medium (Gibco; Grand Island, N.Y.) supplemented with pyridoxine (50 μg/l), nicotinic acid (50 μg/l), glycine (200 μg/l), and 0.6% Phytagar (Gibco) pH 5.8. Seeds are germinated in a Percival chamber at 22° C. in a 16 h photoperiod with cool fluorescent and red light of intensity approximately 65μ Einsteins per square meter per second (μEm⁻² S⁻¹).

Hypocotyls are excised from 5-7 day old seedlings, cut into pieces approximately 4 mm in length, and plated on feeder plates (Horsch et al., Science (1985) 227:1229-1231). Feeder plates are prepared one day before use by plating 1.0 ml of a tobacco suspension culture onto a petri plate (100×25 mm) containing about 30 ml MS salt base (Carolina Biological, Burlington, N.C.) 100 mg/l inositol, 1.3 mg/l thiamine-HCl, 200 mg KH₂ PO₄ with 3% sucrose, 2,4-D (1.0 mg/l), 0.6% w/v Phytagar, and pH adjusted to 5.8 prior to autoclaving (MS 0/1/0 medium). A sterile filter paper disc (Whatman 3 mm) is placed on top of the feeder layer prior to use. Tobacco suspension cultures are subcultured weekly by transfer of 10 ml of culture into 100 ml fresh MS medium as described for the feeder plates with 2,4-D (0.2 mg/l), Kinetin (0.1 mg/l). In experiments where feeder cells are not used hypocotyl explants are cut and placed onto a filter paper disc on top of MS0/1/0 medium. All hypocotyl explants are preincubated on feeder plates for 24 h. at 22° C. in continuous light of intensity 30 μEm⁻² S⁻¹ to

Single colonies of A. tumefaciens strain EHA 101 containing a binary plasmid are transferred to 5 ml MG/L broth and grown overnight at 30° C. Hypocotyl explants are immersed in 7-12 ml MG/L broth with bacteria diluted to 1×10⁸ bacteria/ml and after 10-25 min. are placed onto feeder plates. Per liter, MG/L broth contains 5 g mannitol, 1 g L-glutamic acid or 1.15 g sodium glutamate, 0.25 g KH₂PO₄, 0.10 g NaCl, 0.10 g MgSO₄·7H₂O, 1 mg biotin, 5 g tryptone, and 2.5 g yeast extract, and the broth is adjusted to pH 7.0. After 48 hours of co-incubation with Agrobacterium, the hypocotyl explants are transferred to B5 0/1/0 callus induction medium which contains filter sterilized carbenicillin (500 mg/l, added after autoclaving) and kanamycin sulfate (Boehringer Mannheim; Indianapolis, Ind.) at concentrations of 25 mg/l.
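The per-liter MG/L recipe and the dilution to 1×10⁸ bacteria/ml lend themselves to a small calculation. This is a sketch only: the component amounts are those listed above, while the function names, the example batch volume and the overnight culture density passed to the dilution helper are hypothetical inputs that would have to be measured or chosen by the user.

```python
# MG/L broth components per liter, as listed in the text.
# L-glutamic acid may be replaced by 1.15 g sodium glutamate.
MGL_PER_LITER = {
    "mannitol (g)": 5.0,
    "L-glutamic acid (g)": 1.0,
    "KH2PO4 (g)": 0.25,
    "NaCl (g)": 0.10,
    "MgSO4.7H2O (g)": 0.10,
    "biotin (mg)": 1.0,
    "tryptone (g)": 5.0,
    "yeast extract (g)": 2.5,
}

TARGET_DENSITY = 1e8  # bacteria per ml used for immersing the explants


def scale_recipe(volume_ml):
    """Component amounts for a batch of the given volume (ml)."""
    factor = volume_ml / 1000.0
    return {name: amount * factor for name, amount in MGL_PER_LITER.items()}


def culture_volume_needed(culture_density_per_ml, final_volume_ml):
    """Volume (ml) of overnight culture to dilute into final_volume_ml of broth.

    culture_density_per_ml is a measured value supplied by the user; it is
    not given in the text.
    """
    if culture_density_per_ml < TARGET_DENSITY:
        raise ValueError("culture is already below the target density")
    return TARGET_DENSITY * final_volume_ml / culture_density_per_ml


print(scale_recipe(500))
print(culture_volume_needed(1e9, 10.0))  # hypothetical overnight density
```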

After 3-7 days in culture at 65 μEM⁻² S⁻¹ continuous light, callus tissue is visible on the cut surface and the hypocotyl explants are transferred to shoot induction medium, B5BZ (B5 salts and vitamins supplemented with 3 mg/l benzylaminopurine, 1 mg/l zeatin, 1% sucrose, 0.6% Phytagar and pH adjusted to 5.8). This medium also contains carbenicillin (500 mg/l) and kanamycin sulfate (25 mg/l). Hypocotyl explants are subcultured onto fresh shoot induction medium every two weeks.

Shoots regenerate from the hypocotyl calli after one to three months. Green shoots at least 1 cm tall are excised from the calli and placed on medium containing B5 salts and vitamins, 1% sucrose, carbenicillin (300 mg/l), kanamycin sulfate (50 mg/l) and 0.6% w/v Phytagar. After 2-4 weeks shoots which remain green are cut at the base and transferred to Magenta boxes containing root induction medium (B5 salts and vitamins, 1% sucrose, 2 mg/l indolebutyric acid, 50 mg/l kanamycin sulfate and 0.6% Phytagar). Green rooted shoots are tested for thioesterase activity.

B. Arabidopsis Transformation

Transgenic Arabidopsis thaliana plants may be obtained by Agrobacterium-mediated transformation as described by Valvekens et al., (Proc. Nat. Acad. Sci. (1988) 85:5536-5540). Constructs are transformed into Agrobacterium cells, such as of strain EHA101 (Hood et al., J. Bacteriol (1986) 168:1291-1301), by the method of Holsters et al. (Mol. Gen. Genet. (1978) 163:181-187).

C. Peanut Transformation

DNA sequences of interest may be introduced as expression cassettes, comprising at least a promoter region, a gene of interest, and a termination region, into a plant genome via particle bombardment as described in European Patent Application 332 855 and in co-pending application U.S. Ser. No. 07/225,332, filed Jul. 27, 1988.

Briefly, tungsten or gold particles of a size ranging from 0.5 μm-3 μm are coated with DNA of an expression cassette. This DNA may be in the form of an aqueous mixture or a dry DNA/particle precipitate.

Tissue used as the target for bombardment may be from cotyledonary explants, shoot meristems, immature leaflets, or anthers.

The bombardment of the tissue with the DNA-coated particles is carried out using a Biolistics™ particle gun (Dupont; Wilmington, Del.). The particles are placed in the barrel at variable distances ranging from 1 cm-14 cm from the barrel mouth. The tissue to be bombarded is placed beneath the stopping plate; testing is performed on the tissue at distances up to 20 cm. At the moment of discharge, the tissue is protected by a nylon net or a combination of nylon nets with mesh ranging from 10 μm to 300 μm.

Following bombardment, plants may be regenerated following the method of Atreya, et al., (Plant Science Letters (1984) 34:379-383). Briefly, embryo axis tissue or cotyledon segments are placed on MS medium (Murashige and Skoog, Physio. Plant. (1962) 15:473) (MS plus 2.0 mg/l 6-benzyladenine (BA) for the cotyledon segments) and incubated in the dark for 1 week at 25°±2° C. and are subsequently transferred to continuous cool white fluorescent light (6.8 W/m²). On the 10th day of culture, the plantlets are transferred to pots containing sterile soil, are kept in the shade for 3-5 days and are finally moved to a greenhouse.

The putative transgenic shoots are rooted. Integration of exogenous DNA into the plant genome may be confirmed by various methods known to those skilled in the art.

Example 6

Transformation with Antisense Plant Thioesterase

Constructs for expression of antisense Brassica thioesterase in plant cells are prepared as follows. An approximately 1.1 kb fragment of the full length Brassica long chain thioesterase is obtained by PCR amplification of the pCGN3266 insert. The forward primer binds to the antisense strand and primes synthesis of the sense thioesterase sequence. This primer contains nucleotides 27-42 of the pCGN3266 sequence shown in FIG. 6A, and also has an XhoI restriction site at the 5' end. The reverse primer binds to the sense strand and primes synthesis of antisense thioesterase DNA. It contains the reverse complement to nucleotides 1174-1191 of the pCGN3266 sequence shown in FIG. 6A, and also has a SalI restriction site at the 5' end.

PCR reactions are run using Taq polymerase in a DNA thermocycler (Perkin Elmer/Cetus) according to manufacturer's specifications. Cycle parameters may be altered to provide a maximum yield of the thioesterase PCR product. The 1.1 kb PCR product is verified by restriction mapping and agarose gel electrophoresis. The PCR product is digested with XhoI and SalI restriction enzymes and cloned into the napin expression cassette pCGN3233 which has been digested with XhoI and SalI.

The napin/antisense thioesterase/napin plasmid generated by these manipulations is digested to obtain the napin/antisense thioesterase/napin fragment, which is inserted into binary vectors for plant transformation. For re-transformation of transgenic laurate-producing plants having a kanamycin resistance marker, the fragment is inserted into a hygromycin binary vector as follows. The fragment, containing ˜1.7 kb of napin 5' noncoding sequence, an ˜1.1 kb SalI/XhoI antisense thioesterase cDNA fragment and ˜1.5 kb of 3' napin non-coding region, is engineered to contain KpnI recognition sequences at the ends. The fragment is then digested with KpnI and ligated to KpnI digested pCGN2769 (hygromycin binary vector discussed above) for plant transformation.

For transformation of non-transgenic Brassica, the napin/antisense BTE/napin fragment may be obtained by digestion with KpnI and partial digestion with BamHI to generate an ˜3.3 kb fragment containing ˜1.7 kb of napin 5' noncoding sequence, the ˜1.1 kb SalI/XhoI antisense thioesterase cDNA fragment and ˜0.33 kb of the 3' napin noncoding region, the rest of the napin 3' region having been deleted due to the BamHI site in this region. The ˜3.3 kb KpnI/BamHI fragment may be ligated to KpnI/BamHI digested pCGN1578 to provide a plant transformation vector.

In addition to the above Brassica antisense thioesterase construct, other constructs having various portions of the Brassica thioesterase-encoding sequence may be desirable. Because there are regions of homology between the bay and Brassica thioesterase sequences, the antisense Brassica sequence could also decrease expression of the bay thioesterase. This possibility may be avoided by using fragments of the Brassica gene which are not substantially homologous to the bay gene. For example, the sequences at the 5' and 3' ends of the Brassica clone are not significantly homologous to the bay sequence and are therefore desirable for antisense Brassica thioesterase purposes.

Example 7

Expression of Non-Plant Acyl-ACP Thioesterases in Plants

Constructs for expression of the Vibrio harveyi myristoyl ACP thioesterase in plant cells which utilize napin promoter regions are prepared as follows. Two 100-base oligos are synthesized:

HARV-S: 5' CGG TCT AGA T AA CAA TCA ATG CAA GAC TAT TGC ACA CGT GTT GCG TGT GAA CAA TGG TCA GGA GCT TCA CGT CTG GGA AAC GCC CCC AAA AGA AAA CGT G 3' (SEQ ID NO:13)

HARV-A: 5' ATA CTC GGC CAA TCC AGC GAA GTG GTC CAT TCT TCT GGC GAA ACC AGA AGC AAT CAA AAT GGT GTT GTT TTT AAA AGG CAC GTT TTC TTT TGG GGG CGT T 3' (SEQ ID NO:14)

The two oligos contain a region of complementary sequence for annealing (underlined region). A Taq polymerase extension reaction utilizing the two oligos yields an approximately 180 bp product. The oligos consist essentially of luxD sequence, with sequence changes introduced to remove the 3 potential poly(A) addition sites and to alter 5 bases to change the codon preference from bacteria to plants. All changes are conservative, i.e., the amino acid sequence is not altered.
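The annealing and extension step can be checked computationally. The following Python sketch, offered as an illustration rather than as the method actually used, takes HARV-S and HARV-A exactly as printed above, finds the longest stretch at the 3' end of HARV-S that is complementary to the 3' end of HARV-A, and assembles the double-stranded extension product. The function names are hypothetical.

```python
def reverse_complement(seq):
    """Return the reverse complement of an upper-case DNA string."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


# Oligo sequences as printed in the text (5' to 3'), with spacing removed.
HARV_S = "".join(
    "CGG TCT AGA T AA CAA TCA ATG CAA GAC TAT TGC ACA CGT GTT GCG TGT "
    "GAA CAA TGG TCA GGA GCT TCA CGT CTG GGA AAC GCC CCC AAA AGA AAA "
    "CGT G".split()
)
HARV_A = "".join(
    "ATA CTC GGC CAA TCC AGC GAA GTG GTC CAT TCT TCT GGC GAA ACC AGA "
    "AGC AAT CAA AAT GGT GTT GTT TTT AAA AGG CAC GTT TTC TTT TGG GGG "
    "CGT T".split()
)


def three_prime_overlap(top, bottom):
    """Longest k where the last k bases of `top` pair with the 3' end of `bottom`."""
    bottom_rc = reverse_complement(bottom)
    for k in range(min(len(top), len(bottom_rc)), 0, -1):
        if top[-k:] == bottom_rc[:k]:
            return k
    return 0


def extension_product(top, bottom):
    """Top strand of the duplex formed when both 3' ends are fully extended."""
    k = three_prime_overlap(top, bottom)
    if k == 0:
        raise ValueError("oligos share no complementary 3' region")
    return top + reverse_complement(bottom)[k:]


overlap = three_prime_overlap(HARV_S, HARV_A)
product = extension_product(HARV_S, HARV_A)
print(len(HARV_S), len(HARV_A), overlap, len(product))  # 100 100 22 178
```

With the sequences as printed, the complementary region is 22 bases long and the filled-in product is 178 bp, in line with the approximately 180 bp fragment described.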

The approximately 180 bp Taq polymerase extension product is blunted and cloned into Bluescript. The luxD fragment is then removed from Bluescript by digestion with XbaI and EaeI and cloned in frame with the EaeI/XbaI fragment from the Vibrio cDNA clone, which contains the remainder of the luxD gene, by 3-way ligation into XbaI/XhoI-digested Bluescript SK. The luxD gene is removed by digestion with XbaI and partial digestion with PstI and cloned in frame with the safflower thioesterase transit peptide encoding region into a napin expression cassette. The napin 5'/safflower transit:myristoyl ACP thioesterase/napin 3' fragment is cloned into KpnI/BamHI-digested pCGN1557 (McBride and Summerfelt, supra), resulting in pCGN3845, a binary expression vector for plant transformation.

The resulting transgenic plants are grown to seed and analyzed to determine the percentage of C14 fatty acids produced as the result of insertion of the bacterial acyl transferase gene. Analysis of pooled seed samples from 24 segregating transgenic (T1) Brassica napus plants indicates C14 fatty acid levels ranging from 0.12 to 1.13 mole %. Two plants, 3845-1 and 3845-18, contain greater than 1 mole % C14:0 fatty acids in their seed oils. Similar analysis of non-transgenic B. napus seeds reveals C14:0 levels of approximately 0.1 mole %. Analysis of single seeds from 3845-18 reveals individual seeds having greater than 2 mole % C14:0 in the oil. Western analysis is conducted to determine amounts of the C14:0 thioesterase present in transgenic plants. A comparison of protein amount to mole % C14:0 (myristate) produced indicates that myristate levels increase with increasing amounts of the thioesterase protein.
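To put the reported levels in proportion, the short Python sketch below expresses each value as a multiple of the approximate non-transgenic level. It uses only the figures stated above; the single-seed value is a lower bound ("greater than 2 mole %"), and the function name and labels are illustrative.

```python
# Approximate C14:0 level in non-transgenic B. napus seed (mole %).
BASELINE_MOLE_PCT = 0.1

# Values reported in the text; the last entry is a lower bound.
reported_mole_pct = {
    "pooled T1 seed, lowest plant": 0.12,
    "pooled T1 seed, highest plant": 1.13,
    "single seed of 3845-18 (lower bound)": 2.0,
}


def fold_over_baseline(value, baseline=BASELINE_MOLE_PCT):
    """Return the ratio of a measured level to the non-transgenic level."""
    return value / baseline


for label, value in reported_mole_pct.items():
    print(f"{label}: {fold_over_baseline(value):.1f}-fold")
```

On these figures the best pooled sample is roughly 11-fold above the non-transgenic level, and individual seeds of 3845-18 are at least 20-fold above it.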

All publications and patent applications mentioned in this specification are indicative of the level of skill of those skilled in the art to which this invention pertains. All publications and patent applications are herein incorporated by reference to the same extent as if each individual publication or patent application was specifically and individually indicated to be incorporated by reference.

Although the foregoing invention has been described in some detail by way of illustration and example for purposes of clarity of understanding, it will be obvious that certain changes and modifications may be practiced within the scope of the appended claim.

SEQUENCE LISTING

(1) GENERAL INFORMATION:
(iii) NUMBER OF SEQUENCES: 14

(2) INFORMATION FOR SEQ ID NO:1:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 1561 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: cDNA to mRNA
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:1:
AGAGAGAGAGAGAGAGAGAGAGCTAAATTAAAAAAAAAACCCAGAAGTGGGAAATCTTCC 60
CCATGAAATAACGGATCCTCTTGCTACTGCTACTACTACTACTACAAACTGTAGCCATTT 120
ATATAATTCTATATAATTTTCAACATGGCCACCACCTCTTTAGCTTCCGCTTTC 174
Met Ala Thr Thr Ser Leu Ala Ser Ala Phe
1 5 10
TGCTCGATGAAAGCTGTAATGTTGGCTCGTGATGGCCGGGGCATGAAA 222
Cys Ser Met Lys Ala Val Met Leu Ala Arg Asp Gly Arg Gly Met Lys
15 20 25
CCCAGGAGCAGTGATTTGCAGCTGAGGGCGGGAAATGCGCCAACCTCT 270
Pro Arg Ser Ser Asp Leu Gln Leu Arg Ala Gly Asn Ala Pro Thr Ser
30 35 40
TTGAAGATGATCAATGGGACCAAGTTCAGTTACACGGAGAGCTTGAAA 318
Leu Lys Met Ile Asn Gly Thr Lys Phe Ser Tyr Thr Glu Ser Leu Lys
45 50 55
                                                          AGGTTGCCTGACTGGAGCATGCTCTTTGCAGTGATCACAACCATCTTT366                            ArgLeuProAspTrpSerMetLeuPheAlaValIleThrThrIlePhe                               606570                                                                         TCGGCTGCTGAGAAGCAGTGGACCAATCTAGAGTGGAAGCCGAAGCCG414                            SerAlaAlaGluLysGlnTrpThrAsnLeuGluTrpLysProLysPro                               75808590                                                                       AAGCTACCCCAGTTGCTTGATGACCATTTTGGACTGCATGGGTTAGTT462                            LysLeuProGlnLeuLeuAspAspHisPheGlyLeuHisGlyLeuVal                               95100105                                                                       TTCAGGCGCACCTTTGCCATCAGATCTTATGAGGTGGGACCTGACCGC510                            PheArgArgThrPheAlaIleArgSerTyrGluValGlyProAspArg                               110115120                                                                      TCCACATCTATACTGGCTGTTATGAATCACATGCAGGAGGCTACACTT558                            SerThrSerIleLeuAlaValMetAsnHisMetGlnGluAlaThrLeu                               125130135                                                                      AATCATGCGAAGAGTGTGGGAATTCTAGGAGATGGATTCGGGACGACG606                            AsnHisAlaLysSerValGlyIleLeuGlyAspGlyPheGlyThrThr                               140145150                                                                      CTAGAGATGAGTAAGAGAGATCTGATGTGGGTTGTGAGACGCACGCAT654                            LeuGluMetSerLysArgAspLeuMetTrpValValArgArgThrHis                               155160165170                                                                   GTTGCTGTGGAACGGTACCCTACTTGGGGTGATACTGTAGAAGTAGAG702                            ValAlaValGluArgTyrProThrTrpGlyAspThrValGluValGlu                               175180185                                                                      
TGCTGGATTGGTGCATCTGGAAATAATGGCATGCGACGTGATTTCCTT750                            CysTrpIleGlyAlaSerGlyAsnAsnGlyMetArgArgAspPheLeu                               190195200                                                                      GTCCGGGACTGCAAAACAGGCGAAATTCTTACAAGATGTACCAGCCTT798                            ValArgAspCysLysThrGlyGluIleLeuThrArgCysThrSerLeu                               205210215                                                                      TCGGTGCTGATGAATACAAGGACAAGGAGGTTGTCCACAATCCCTGAC846                            SerValLeuMetAsnThrArgThrArgArgLeuSerThrIleProAsp                               220225230                                                                      GAAGTTAGAGGGGAGATAGGGCCTGCATTCATTGATAATGTGGCTGTC894                            GluValArgGlyGluIleGlyProAlaPheIleAspAsnValAlaVal                               235240245250                                                                   AAGGACGATGAAATTAAGAAACTACAGAAGCTCAATGACAGCACTGCA942                            LysAspAspGluIleLysLysLeuGlnLysLeuAsnAspSerThrAla                               255260265                                                                      GATTACATCCAAGGAGGTTTGACTCCTCGATGGAATGATTTGGATGTC990                            AspTyrIleGlnGlyGlyLeuThrProArgTrpAsnAspLeuAspVal                               270275280                                                                      AATCAGCATGTGAACAACCTCAAATACGTTGCCTGGGTTTTTGAGACC1038                           AsnGlnHisValAsnAsnLeuLysTyrValAlaTrpValPheGluThr                               285290295                                                                      GTCCCAGACTCCATCTTTGAGAGTCATCATATTTCCAGCTTCACTCTT1086                           ValProAspSerIlePheGluSerHisHisIleSerSerPheThrLeu                               300305310                                                                      GAATACAGGAGAGAGTGCACGAGGGATAGCGTGCTGCGGTCCCTGACC1134                           
GluTyrArgArgGluCysThrArgAspSerValLeuArgSerLeuThr                               315320325330                                                                   ACTGTCTCTGGTGGCTCGTCGGAGGCTGGGTTAGTGTGCGATCACTTG1182                           ThrValSerGlyGlySerSerGluAlaGlyLeuValCysAspHisLeu                               335340345                                                                      CTCCAGCTTGAAGGTGGGTCTGAGGTATTGAGGGCAAGAACAGAGTGG1230                           LeuGlnLeuGluGlyGlySerGluValLeuArgAlaArgThrGluTrp                               350355360                                                                      AGGCCTAAGCTTACCGATAGTTTCAGAGGGATTAGTGTGATACCCGCA1278                           ArgProLysLeuThrAspSerPheArgGlyIleSerValIleProAla                               365370375                                                                      GAACCGAGGGTGTAACTAATGAAAGAAGCATCTGTTGAAGTTTCTCCCATGC1330                       GluProArgVal                                                                   380                                                                            TGTTCGTGAGGATACTTTTTAGAAGCTGCAGTTTGCATTGCTTGTGCAGAATCATGGTCT1390               GTGGTTTTAGATGTATATAAAAAATAGTCCTGTAGTCATGAAACTTAATATCAGAAAAAT1450               AACTCAATGGGTCAAGGTTATCGAAGTAGTCATTTAAGCTTTGAAATATGTTTTGTATTC1510               CTCGGCTTAATCTGTAAGCTCTTTCTCTTGCAATAAAGTTCGCCTTTCAAT1561                        (2) INFORMATION FOR SEQ ID NO:2:                                               (i) SEQUENCE CHARACTERISTICS:                                                  (A) LENGTH: 1433 base pairs                                                    (B) TYPE: nucleic acid                                                         (C) STRANDEDNESS: double                                                       (D) TOPOLOGY: linear                                                           (ii) MOLECULE TYPE: cDNA to mRNA                                               (xi) SEQUENCE 
DESCRIPTION: SEQ ID NO:2:                                        GAATTCGGCACGAGGGGCTCCGGTGCTTTGCAGGTGAAGGCAAGTTCC48                             GluPheGlyThrArgGlySerGlyAlaLeuGlnValLysAlaSerSer                               51015                                                                          CAAGCTCCACCAAAGCTCAATGGTTCCAATGTGGGTTTGGTTAAATCT96                             GlnAlaProProLysLeuAsnGlySerAsnValGlyLeuValLysSer                               202530                                                                         AGCCAAATTGTGAAGAAGGGTGATGACACCACATCTCCTCCTGCAAGA144                            SerGlnIleValLysLysGlyAspAspThrThrSerProProAlaArg                               354045                                                                         ACTTTCATCAACCAATTGCCTGATTGGAGCATGCTTCTTGCTGCTATC192                            ThrPheIleAsnGlnLeuProAspTrpSerMetLeuLeuAlaAlaIle                               505560                                                                         ACAACCCTGTTCTTGGCTGCAGAGAAGCAGTGGATGATGCTTGATTGG240                            ThrThrLeuPheLeuAlaAlaGluLysGlnTrpMetMetLeuAspTrp                               65707580                                                                       AAACCCAAAAGGCCTGACATGCTTGTTGATCCATTTGGTCTTGGAAGG288                            LysProLysArgProAspMetLeuValAspProPheGlyLeuGlyArg                               859095                                                                         TTTGTTCAGGATGGTCTTGTTTTCCGCAACAACTTTTCAATTCGATCA336                            PheValGlnAspGlyLeuValPheArgAsnAsnPheSerIleArgSer                               100105110                                                                      TATGAAATAGGGGCTGATCGAACGGCTTCTATAGAAACGTTAATGAAT384                            TyrGluIleGlyAlaAspArgThrAlaSerIleGluThrLeuMetAsn                               115120125                                                                      
CATCTGCAGGAAACAGCTCTTAATCATGTGAAGTCTGTTGGGCTTCTT432                            HisLeuGlnGluThrAlaLeuAsnHisValLysSerValGlyLeuLeu                               130135140                                                                      GAGGATGGCCTAGGTTCGACTCGAGAGATGTCCTTGAGGAACCTGATA480                            GluAspGlyLeuGlySerThrArgGluMetSerLeuArgAsnLeuIle                               145150155160                                                                   TGGGTTGTCACTAAAATGCAGGTTGCGGTTGATCGCTATCCAACTTGG528                            TrpValValThrLysMetGlnValAlaValAspArgTyrProThrTrp                               165170175                                                                      GGAGATGAAGTTCAGGTATCCTCTTGGGCTACTGCAATTGGAAAGAAT576                            GlyAspGluValGlnValSerSerTrpAlaThrAlaIleGlyLysAsn                               180185190                                                                      GGAATGCGTCGCGAATGGATAGTCACTGATTTTAGAACTGGTGAAACT624                            GlyMetArgArgGluTrpIleValThrAspPheArgThrGlyGluThr                               195200205                                                                      CTATTAAGAGCCACCAGTGTTTGGGTGATGATGAATAAACTGACGAGG672                            LeuLeuArgAlaThrSerValTrpValMetMetAsnLysLeuThrArg                               210215220                                                                      AGGATATCCAAAATCCCAGAAGAGGTTTGGCACGAAATAGGCCCCTCT720                            ArgIleSerLysIleProGluGluValTrpHisGluIleGlyProSer                               225230235240                                                                   TTCATTGATGCTCCTCCTCTTCCCACCGTGGAAGATGATGGTAGAAAG768                            PheIleAspAlaProProLeuProThrValGluAspAspGlyArgLys                               245250255                                                                      CTGACAAGGTTTGATGAAAGTTCTGCAGACTTTATCCGCNCTGGTTTA816                            
LeuThrArgPheAspGluSerSerAlaAspPheIleArgXxxGlyLeu                               260265270                                                                      ACTCCTAGGTGGAGTGATTTGGACATCAACCAGCATGTCAACAATGTG864                            ThrProArgTrpSerAspLeuAspIleAsnGlnHisValAsnAsnVal                               275280285                                                                      AAGTACATTGGCTGGCTCCTTGAGAGTGCTCCGCCGGAGATCCACGAG912                            LysTyrIleGlyTrpLeuLeuGluSerAlaProProGluIleHisGlu                               290295300                                                                      AGTCACGAGATAGCGTCTCTGACTCTGGAGTACAGGAGGGAGTGTGGA960                            SerHisGluIleAlaSerLeuThrLeuGluTyrArgArgGluCysGly                               305310315320                                                                   AGGGACAGCGTGCTGAACTCCGCGACCAAGGTCTCTGACTCCTCTCAA1008                           ArgAspSerValLeuAsnSerAlaThrLysValSerAspSerSerGln                               325330335                                                                      CTGGGAAAGTCTGCTGTGGAGTGTAACCACTTGGTTCGTCTCCAGAAT1056                           LeuGlyLysSerAlaValGluCysAsnHisLeuValArgLeuGlnAsn                               340345350                                                                      GGTGGGGAGATTGTGAAGGGAAGGACTGTGTGGAGGCCCAAACGTCCT1104                           GlyGlyGluIleValLysGlyArgThrValTrpArgProLysArgPro                               355360365                                                                      CTTTACAATGATGGTGCTGTTGTGGACGTGNAAGCTAAAACCTCT1149                              LeuTyrAsnAspGlyAlaValValAspValXxxAlaLysThrSer                                  370375380                                                                      TAAGTCTTATAGTCCAAGTGAGGAGGAGTTCTATGTATCAGGAAGTTGCTAGGATTCTCA1209               ATCGCATGTGTCCATTTCTTGTGTGGAATACTGCTCGTGTTTCTAGACTCGCTATATGTT1269               
TGTTCTTTTATATATATATATATATATATATCTCTCTCTTCCCCCCACCTCTCTCTCTCT1329               CTCTATATATATATATGTTTTATGTAAGTTTTCCCCTTAGTTTCCTTTCCTAAGTAATGC1389               CATTGTAAATTACTTCAAAAAAAAAAAAAAAAAAAAAACTCGAG1433                               (2) INFORMATION FOR SEQ ID NO:3:                                               (i) SEQUENCE CHARACTERISTICS:                                                  (A) LENGTH:126 base pairs                                                      (B) TYPE: nucleic acid                                                         (C) STRANDEDNESS: double                                                       (D) TOPOLOGY: linear                                                           (ii) MOLECULE TYPE:other nucleic acid                                          (A) DESCRIPTION: PCR to mRNA                                                   (xi) SEQUENCE DESCRIPTION: SEQ ID NO:3:                                        TGGATCCAATCAACATGTCAACAATGTGAAATACATTGGGTGGATTCTC49                            AsnGlnHisValAsnAsnValLysTyrIleGlyTrpIleLeu                                     1510                                                                           AAGAGTGTTCCAACAAAAGTTTTCGAGACCCAGGAGTTATGTGGCGTC97                             LysSerValProThrLysValPheGluThrGlnGluLeuCysGlyVal                               15202530                                                                       ACCCTCGAGTACCGGCGGGAATGCTCGAG126                                               ThrLeuGluTyrArgArgGluCys                                                       35                                                                             (2) INFORMATION FOR SEQ ID NO:4:                                               (i) SEQUENCE CHARACTERISTICS:                                                  (A) LENGTH: 114 base pairs                                                     (B) TYPE: nucleic acid                                                         (C) STRANDEDNESS: single 
                                                      (D) TOPOLOGY: linear                                                           (ii) MOLECULE TYPE: other nucleic acid                                         (A) DESCRIPTION: PCR to mRNA                                                   (xi) SEQUENCE DESCRIPTION: SEQ ID NO:4:                                        AATCAACATGTCAACAATGTGAAATACATTGGGTGGATTCTCAAGAGTGTTCCAACAAAA60                 GTTTTCGAGACCCAGGAGTTATGTGGCGTCACCCTCGAGTACCGGCGGGAATGC114                      (2) INFORMATION FOR SEQ ID NO:5:                                               (i) SEQUENCE CHARACTERISTICS:                                                  (A) LENGTH: 99 base pairs                                                      (B) TYPE: nucleic acid                                                         (C) STRANDEDNESS: single                                                       (D) TOPOLOGY: linear                                                           (ii) MOLECULE TYPE: other nucleic acid                                         (A) DESCRIPTION: PCR to mRNA                                                   (xi) SEQUENCE DESCRIPTION: SEQ ID NO:5 :                                       AATCAGCATGTGAATAACGTGAAATACATTGGGTGGATTCTCAAGAGTGTTCCAACAGAT60                 GTTTTTGAGGCCCAGGAGCTATGTGGAGTCACCCTCGAG99                                      (2) INFORMATION FOR SEQ ID NO:6:                                               (i) SEQUENCE CHARACTERISTICS:                                                  (A) LENGTH: 1601 base pairs                                                    (B) TYPE: nucleic acid                                                         (C) STRANDEDNESS: double                                                       (D) TOPOLOGY: linear                                                           (ii) MOLECULE TYPE: cDNA to mRNA                                               (xi) SEQUENCE DESCRIPTION: SEQ ID NO:6:           
                             ACGCGGTGGCGGCCGCTCTAGAACTAGTGGATCCCCCGGGCTGCAGGAATTCGGCACGAG60                 CTTTCTCCCCCACAACCTCTTTCCCGCATTTGTTGAGCTGTTTTTTGTCGCCATTCGCCC120                TCTCCTCTTCAGTTCAACGAAAATGGTGGCTACCCTGCAAGTTCTGCATTCTTCCCCCTG180                CCATCCGCCGACACCTCCTCTTCGAGACCCGGAAAGCTCGGCAATGGGCCATCGAGCTTC240                AGCCCCCTCAAGCCCAAATCGACCCCCAATGGCGGTTTGCAGGTTAAGGCAAACGCCAGC300                GCCCCTCCTAAGATCAATGGTTCACCGGTCGGTCTAAAGTCGGGCGGTCTCAAGACTCAG360                GAAGACGCTCCTTCGGCCCCTCCTCCGCGGACTTTTATCAACCAGTTGCCTGATTGGAGT420                ATGCTTCTTGCTGCAATCACTACTGTCTTCTTGGCTGCAGAGAAGCAGTGGATGATGCTT480                Leu                                                                            GATTGGAAACCTAAGAGGCCTGACATGCTTGTGGACCCGTTCGGATTG528                            AspTrpLysProLysArgProAspMetLeuValAspProPheGlyLeu                               51015                                                                          GGAAGTATTGTTCAGGATGGGCTTGTGTTCAGGCAGAATTTTTCGATT576                            GlySerIleValGlnAspGlyLeuValPheArgGlnAsnPheSerIle                               202530                                                                         AGGTCCTATGAAATAGGCGCCGATCGCACTGCGTCTATAGAGACGGTG624                            ArgSerTyrGluIleGlyAlaAspArgThrAlaSerIleGluThrVal                               354045                                                                         ATGAACCATTTGCAGGAAACAGCTCTCAATCATGTTAAGATTGCTGGG672                            MetAsnHisLeuGlnGluThrAlaLeuAsnHisValLysIleAlaGly                               50556065                                                                       CTTTCTAATGACGGCTTTGGTCGTACTCCTGAGATGTATAAAAGGGAC720                            LeuSerAsnAspGlyPheGlyArgThrProGluMetTyrLysArgAsp                               707580                                                                         CTTATTTGGGTTGTTGCAAAAATGCAGGTCATGGTTAACCGCTATCCT768                        
    LeuIleTrpValValAlaLysMetGlnValMetValAsnArgTyrPro                               859095                                                                         ACTTGGGGTGACACGGTTGAAGTGAATACTTGGGTTGCCAAGTCAGGG816                            ThrTrpGlyAspThrValGluValAsnThrTrpValAlaLysSerGly                               100105110                                                                      AAAAATGGTATGCGTCGTGACTGGCTCATAAGTGATTGTAATACTGGA864                            LysAsnGlyMetArgArgAspTrpLeuIleSerAspCysAsnThrGly                               115120125                                                                      GAGATTCTTACAAGAGCATCAAGCGTGTGGGTCATGATGAATCAAAAG912                            GluIleLeuThrArgAlaSerSerValTrpValMetMetAsnGlnLys                               130135140145                                                                   ACAAGAAGATTGTCAAAAATTCCAGATGAGGTTCGAAATGAGATAGAG960                            ThrArgArgLeuSerLysIleProAspGluValArgAsnGluIleGlu                               150155160                                                                      CCTCATTTTGTGGACTCTCCTCCCGTCATTGAAGATGATGACCGGAAA1008                           ProHisPheValAspSerProProValIleGluAspAspAspArgLys                               165170175                                                                      CTTCCCAAGCTGGATGAGAAGACTGCTGACTCCATCCGCAAGGGTCTA1056                           LeuProLysLeuAspGluLysThrAlaAspSerIleArgLysGlyLeu                               180185190                                                                      ACTCCGAGGTGGAATGACTTGGATGTCAATCAGCACGTCAACAACGTG1104                           ThrProArgTrpAsnAspLeuAspValAsnGlnHisValAsnAsnVal                               195200205                                                                      AAGTACATCGGGTGGATTCTTGAGAGTACTCCACCAGAAGTTCTGGAG1152                           LysTyrIleGlyTrpIleLeuGluSerThrProProGluValLeuGlu                               210215220225         
                                                          ACACAGGAGTTATGTTCCCTTACCCTGGAATACAGGCGGGAATGTGGA1200                           ThrGlnGluLeuCysSerLeuThrLeuGluTyrArgArgGluCysGly                               230235240                                                                      AAGGAGAGTGTTCTGGAGTCCCTCACTGCTATGGACCCCTCTGGAGGG1248                           LysGluSerValLeuGluSerLeuThrAlaMetAspProSerGlyGly                               245250255                                                                      GGCTATGGGTCCCAGTTTCAGCACCTTCTGCGGCTTGAGGATGGAGGT1296                           GlyTyrGlySerGlnPheGlnHisLeuLeuArgLeuGluAspGlyGly                               260265270                                                                      GAGATCGTGAAGGGGAGAACCGAGTGGCGAACCCAAGAATGGTGTAAT1344                           GluIleValLysGlyArgThrGluTrpArgThrGlnGluTrpCysAsn                               275280285                                                                      CAATGGGGTGGTACCAACCGGGGAGTCCTCGCCTGGAGACTACTCTTA1392                           GlnTrpGlyGlyThrAsnArgGlyValLeuAlaTrpArgLeuLeuLeu                               290395300305                                                                   GAAGGGGGAGCCCTGACCCCTTTGGAGTTATGCTTTCTTTATTGTCGG1440                           GluGlyGlyAlaLeuThrProLeuGluLeuCysPheLeuTyrCysArg                               310315320                                                                      ACGAGCTGAGTGAAGGGCAGGTAAGATAGTAGCAATCGGTAGATTGTGTAGTTTGT1496                   ThrSer                                                                         TTGCTGCTTTTCACGATGGCTCTCGTGTATAATATCATGGTCGTCTTCTTTGTATCCTCT1556               TCGCATGTTCCGGGTTGATTTATACATTATATTCTTTCTAAAAAA1601                              (2) INFORMATION FOR SEQ ID NO:7:                                               (i) SEQUENCE CHARACTERISTICS:                                                  (A) LENGTH: 1744 base pairs                   
                                 (B) TYPE: nucleic acid                                                         (C) STRANDEDNESS: double                                                       (D) TOPOLOGY: linear                                                           (ii) MOLECULE TYPE: cDNA ro mRNA                                               (xi) SEQUENCE DESCRIPTION: SEQ ID NO:7:                                        CTTTGATCGGTCGATCCTTTCCTCTCGCTCATAATTTACCCATTAGTCCCCTTTGCCTTC60                 TTTAAACCCTCCTTTCCTTTCTCTTCCCTTCTTCCTCTCTGGGAAGTTTAAAGCTTTTGC120                CTTTCTCCCCCCCACAACCTCTTTCCCGCATTTGTTGAGCTGTTTTTTTGTCGCCATTCG180                TCCTCTCCTCTTCAGTTCAACAGAAATGGTGGCTACCGCTGCAAGTTCTGCA232                        MetValAlaThrAlaAlaSerSerAla                                                    15                                                                             TTCTTCCCCCTCCCATCCGCCGACACCTCATCGAGACCCGGAAAGCTC280                            PhePheProLeuProSerAlaAspThrSerSerArgProGlyLysLeu                               10152025                                                                       GGCAATAAGCCATCGAGCTTGAGCCCCCTCAAGCCCAAATCGACCCCC328                            GlyAsnLysProSerSerLeuSerProLeuLysProLysSerThrPro                               303540                                                                         AATGGCGGTTTGCAGGTTAAGGCAAATGCCAGTGCCCCTCCTAAGATC376                            AsnGlyGlyLeuGlnValLysAlaAsnAlaSerAlaProProLysIle                               455055                                                                         AATGGTTCCCCGGTCGGTCTAAAGTCGGGCGGTCTCAAGACTCAGGAA424                            AsnGlySerProValGlyLeuLysSerGlyGlyLeuLysThrGlnGlu                               606570                                                                         GACGCTCATTCGGCCCCTCCTCCGCGAACTTTTATCAACCAGTTGCCT472                            AspAlaHisSerAlaProProProArgThrPheIleAsnGlnLeuPro                       
        758085                                                                         GATTGGAGTATGCTTCTTGCTGCAATCACGACTGTCTTCTTGGCTGCA520                            AspTrpSerMetLeuLeuAlaAlaIleThrThrValPheLeuAlaAla                               9095100105                                                                     GAGAAGCAATGGATGATGCTTGATTGGAAACCTAAGAGGCCTGACATG568                            GluLysGlnTrpMetMetLeuAspTrpLysProLysArgProAspMet                               110115120                                                                      CTTGTGGACCCGTTTGGATTGGGAAGTATTGTTCAGGATGGGCTTGTG616                            LeuValAspProPheGlyLeuGlySerIleValGlnAspGlyLeuVal                               125130135                                                                      TTCAGGCAGAATTTTTCGATTAGGTCCTATGAAATAGGCGCCGATCGC664                            PheArgGlnAsnPheSerIleArgSerTyrGluIleGlyAlaAspArg                               140145150                                                                      ACTGCGTCTATAGAGACGGTGATGAACCATTTGCAGGAAACAGCTCTC712                            ThrAlaSerIleGluThrValMetAsnHisLeuGlnGluThrAlaLeu                               155160165                                                                      AATCATGTTAAGATTGCTGGGCTTTCTAATGACGGCTTTGGTCGTACT760                            AsnHisValLysIleAlaGlyLeuSerAsnAspGlyPheGlyArgThr                               170175180185                                                                   CCTGAGATGTATAAAAGGGACCTTATTTGGGTTGTTGCGAAAATGCAA808                            ProGluMetTyrLysArgAspLeuIleTrpValValAlaLysMetGln                               190195200                                                                      GTCATGGTTAACCGCTATCCTACTTGGGGTGACACGGTTGAAGTGAAT856                            ValMetValAsnArgTyrProThrTrpGlyAspThrValGluValAsn                               205210215                                                                      
ACTTGGGTTGCCAAGTCAGGGAAAAATGGTATGCGTCGTGACTGGCTC904                            ThrTrpValAlaLysSerGlyLysAsnGlyMetArgArgAspTrpLeu                               220225230                                                                      ATAAGTGATTGCAATACTGGAGAGATTCTTACAAGAGCATCAAGCGTG952                            IleSerAspCysAsnThrGlyGluIleLeuThrArgAlaSerSerVal                               235240245                                                                      TGGGTCATGATGAATCAAAAGACAAGAAGATTGTCAAAAATTCCAGAT1000                           TrpValMetMetAsnGlnLysThrArgArgLeuSerLysIleProAsp                               250255260265                                                                   GAGGTTCGAAATGAGATAGAGCCTCATTTTGTGGACTCTCCTCCCGTC1048                           GluValArgAsnGluIleGluProHisPheValAspSerProProVal                               270275280                                                                      ATTGAAGACGATGACCGGAAACTTCCCAAGCTGGATGAGAAGACTGCT1096                           IleGluAspAspAspArgLysLeuProLysLeuAspGluLysThrAla                               285290295                                                                      GACTCCATCCGCAAGGGTCTAACTCCGAGGTGGAATGACTTGGATGTC1144                           AspSerIleArgLysGlyLeuThrProArgTrpAsnAspLeuAspVal                               300305310                                                                      AATCAACACGTCAACAACGTGAAGTACATCGGGTGGATTCTTGAGAGT1192                           AsnGlnHisValAsnAsnValLysTyrIleGlyTrpIleLeuGluSer                               315320325                                                                      ACTCCACCAGAAGTTCTGGAGACCCAGGAGTTATGTTCCCTTACTCTG1240                           ThrProProGluValLeuGluThrGlnGluLeuCysSerLeuThrLeu                               330335340345                                                                   GAATACAGGCGGGAATGTGGAAGGGAGAGCGTGCTGGAGTCCCTCACT1288                           
GluTyrArgArgGluCysGlyArgGluSerValLeuGluSerLeuThr                               350355360                                                                      GCTATGGATCCCTCTGGAGGGGGTTATGGGTCCCAGTTTCAGCACCTT1336                           AlaMetAspProSerGlyGlyGlyTyrGlySerGlnPheGlnHisLeu                               365370375                                                                      CTGCGGCTTGAGGATGGAGGTGAGATCGTGAAGGGGAGAACTGAGTGG1384                           LeuArgLeuGluAspGlyGlyGluIleValLysGlyArgThrGluTrp                               380385390                                                                      CGGCCCAAGAATGGTGTAATCAATGGGGTGGTACCAACCGGGGAGTCC1432                           ArgProLysAsnGlyValIleAsnGlyValValProThrGlyGluSer                               395400405                                                                      TCACCTGGAGACTACTCTTAGAAGGGAGCCCTGACCCCTTTGGAGTTG1480                           SerProGlyAspTyrSer                                                             410415                                                                         TGATTTCTTTATTGTCGGACGAGCTAAGTGAAGGGCAGGTAAGATAGTAGCAATCGGTAG1540               ATTGTGTAGTTTGTTTGCTGCTTTTTCACGATGGCTCTCGTGTATAATATCATGGTCTGT1600               CTTCTTTGTATCCTCTTCTTCGCATGTTCCGGGTTGATTCATACATTATATTCTTTCTAT1660               TTGTTTGAAGGCGAGTAGCGGGTTGTAATTATTTATTTTGTCATTACAATGTCGTTTAAC1720               TTTTCAAATGAAACTACTTATGTG1744                                                   (2) INFORMATION FOR SEQ ID NO:8:                                               (i) SEQUENCE CHARACTERISTICS:                                                  (A) LENGTH: 1474 base pairs                                                    (B) TYPE: nucleic acid                                                         (C) STRANDEDNESS: double                                                       (D) TOPOLOGY: linear                                                           (ii) MOLECULE TYPE: cDNA 
to mRNA                                               (xi) SEQUENCE DESCRIPTION: SEQ ID NO:8:                                        CTGGATACCATTTTCCCTGCGAAAAAACATGGTGGCTGCTGCAGCAAGTTCC52                         MetValAlaAlaAlaAlaSerSer                                                       15                                                                             GCATTCTTCCCTGTTCCAGCCCCGGGAGCCTCCCCTAAACCCGGGAAG100                            AlaPhePheProValProAlaProGlyAlaSerProLysProGlyLys                               101520                                                                         TTCGGAAATTGGCCCTCGAGCTTGAGCCCTTCCTTCAAGCCCAAGTCA148                            PheGlyAsnTrpProSerSerLeuSerProSerPheLysProLysSer                               25303540                                                                       ATCCCCAATGGCGGATTTCAGGTTAAGGCAAATGACAGCGCCCATCCA196                            IleProAsnGlyGlyPheGlnValLysAlaAsnAspSerAlaHisPro                               455055                                                                         AAGGCTAACGGTTCTGCAGTTAGTCTAAAGTCTGGCAGCCTCAACACT244                            LysAlaAsnGlySerAlaValSerLeuLysSerGlySerLeuAsnThr                               606570                                                                         CAGGAGGACACTTCGTCGTCCCCTCCTCCTCGGACTTTCCTTCACCAG292                            GlnGluAspThrSerSerSerProProProArgThrPheLeuHisGln                               758085                                                                         TTGCCTGATTGGAGTAGGCTTCTGACTGCAATCACGACCGTGTTCGTG340                            LeuProAspTrpSerArgLeuLeuThrAlaIleThrThrValPheVal                               9095100                                                                        AAATCTAAGAGGCCTGACATGCATGATCGGAAATCCAAGAGGCCTGAC388                            LysSerLysArgProAspMetHisAspArgLysSerLysArgProAsp                               105110115120                                      
                             ATGCTGGTGGACTCGTTTGGGTTGGAGAGTACTGTTCAGGATGGGCTC436                            MetLeuValAspSerPheGlyLeuGluSerThrValGlnAspGlyLeu                               125130135                                                                      GTGTTCCGACAGAGTTTTTCGATTAGGTCTTATGAAATAGGCACTGAT484                            ValPheArgGlnSerPheSerIleArgSerTyrGluIleGlyThrAsp                               140145150                                                                      CGAACGGCCTCTATAGAGACACTTATGAACCACTTGCAGGAAACATCT532                            ArgThrAlaSerIleGluThrLeuMetAsnHisLeuGlnGluThrSer                               155160165                                                                      CTCAATCATTGTAAGAGTACCGGTATTCTCCTTGACGGCTTCGGTCGT580                            LeuAsnHisCysLysSerThrGlyIleLeuLeuAspGlyPheGlyArg                               170175180                                                                      ACTCTTGAGATGTGTAAAAGGGACCTCATTTGGGTGGTAATAAAAATG628                            ThrLeuGluMetCysLysArgAspLeuIleTrpValValIleLysMet                               185190195200                                                                   CAGATCAAGGTGAATCGCTATCCAGCTTGGGGCGATACTGTCGAGATC676                            GlnIleLysValAsnArgTyrProAlaTrpGlyAspThrValGluIle                               205210215                                                                      AATACCCGGTTCTCCCGGTTGGGGAAAATCGGTATGGGTCGCGATTGG724                            AsnThrArgPheSerArgLeuGlyLysIleGlyMetGlyArgAspTrp                               220225230                                                                      CTAATAAGTGATTGCAACACAGGAGAAATTCTTGTAAGAGCTACGAGC772                            LeuIleSerAspCysAsnThrGlyGluIleLeuValArgAlaThrSer                               235240245                                                                      GCGTATGCCATGATGAATCAAAAGACGAGAAGACTCTCAAAACTTCCA820                        
    AlaTyrAlaMetMetAsnGlnLysThrArgArgLeuSerLysLeuPro                               250255260                                                                      TACGAGGTTCACCAGGAGATAGTGCCTCTTTTTGTCGACTCTCCTGTC868                            TyrGluValHisGlnGluIleValProLeuPheValAspSerProVal                               265270275280                                                                   ATTGAAGACAGTGATCTGAAAGTGCATAAGTTTAAAGTGAAGACTGGT916                            IleGluAspSerAspLeuLysValHisLysPheLysValLysThrGly                               285290295                                                                      GATTCCATTCAAAAGGGTCTAACTCCGGGGTGGAATGACTTGGATGTC964                            AspSerIleGlnLysGlyLeuThrProGlyTrpAsnAspLeuAspVal                               300305310                                                                      AATCAGCACGTAAGCAACGTGAAGTACATTGGGTGGATTCTCGAGAGT1012                           AsnGlnHisValSerAsnValLysTyrIleGlyTrpIleLeuGluSer                               315320325                                                                      ATGCCAACAGAAGTTTTGGAGACCCAGGAGCTATGCTCTCTCGCCCTT1060                           MetProThrGluValLeuGluThrGlnGluLeuCysSerLeuAlaLeu                               330335340                                                                      GAATATAGGCGGGAATGCGGAAGGGACAGTGTGCTGGAGTCCGTGACC1108                           GluTyrArgArgGluCysGlyArgAspSerValLeuGluSerValThr                               345350355360                                                                   GCTATGGATCCCTCAAAAGTTGGAGTCCGTTCTCAGTACCAGCACCTT1156                           AlaMetAspProSerLysValGlyValArgSerGlnTyrGlnHisLeu                               365370375                                                                      CTGCGGCTTGAGGATGGGACTGCTATCGTGAACGGTGCAACTGAGTGG1204                           LeuArgLeuGluAspGlyThrAlaIleValAsnGlyAlaThrGluTrp                               380385390            
                                                          CGGCCGAAGAATGCAGGAGCTAACGGGGCGATATCAACGGGAAAGACT1252                           ArgProLysAsnAlaGlyAlaAsnGlyAlaIleSerThrGlyLysThr                               395400405                                                                      TCAAATGGAAACTCGGTCTCTTAGAAGTGTCTCGGAACCCTTCCGAGATGT1303                        SerAsnGlyAsnSerValSer                                                          410415                                                                         GCATTTCTTTTCTCCTTTTCATTTTGTGGTGAGCTGAAAGAAGAGCATGTCGTTGCAATC1363               AGTAAATTGTGTAGTTCGTTTTTCGCTTTGCTTCGCTCCTTTGTATAATAATATGGTCAG1423               TCGTCTTTGTATCATTTCATGTTTTCAGTTTATTTACGCCATATAATTTTT1474                        (2) INFORMATION FOR SEQ ID NO:9:                                               (i) SEQUENCE CHARACTERISTICS:                                                  (A) LENGTH: 976 base pairs                                                     (B) TYPE: nucleic acid                                                         (C) STRANDEDNESS: double                                                       (D) TOPOLOGY: linear                                                           (ii) MOLECULE TYPE: cDNA to mRNA                                               (xi) SEQUENCE DESCRIPTION: SEQ ID NO:9:                                        GGCACGAGAAACATGGTGGCTGCCGCAGCAAGTTCTGCATTCTTCTCCGTTCCAACCCCG60                 GGAATCTCCCCTAAACCCGGGAAGTTCGGTAATGGTGGCTTTCAGGTTAAGGCAAACGCC120                AATGCCCATCCTAGTCTAAAGTCTGGCAGCCTCGAGACTGAAGATGACACTTCATCGTCG180                TCCCCTCCTCCTCGGACTTTCATTAACCAGTTGCCCGACTGGAGTATGCTTCTGTCCGCA240                ATCACGACTATCTTCGGGGCAGCTGAGAAGCAGTGGATGATGCTTGATAGGAAATCTAAG300                NAGACCCGACATGCTCATGGCAACCGTTTGGGGTTGACAGTATTGTTCAGGATGGGGTTT360                TTTTCAGACAGAGTTTTTCGATTAGATCTTACGAAATAGGCGCTGATCGAACAACCTCAA420                
TAGAGACGCTGATGAACATGTTCCAGGAAACGTCTTTGAATCATTGTAAGAGTAACGGTC480                TTCTCAATGACGGCTTTGGTCGCACTCCTGAGATGTGTAAGAAGGGCCTCATTTGGGTGG540                TTACGAAAATGCAGGTCGAGGTGAATCGCTATCCTATTTGGSGTGATTCTATCGAAGTCA600                ATACTTGGGTCTCCGAGTCGGGGNAAAANCGGTATGGGTCGTGATTGGCTGATAAGTGAT660                TGCAGTACAGGAGNAAATTCTTGTAAGAGCAACGAGCGTGTGGGCTATGATGAATCAAAA720                GACGAGAAGATTGTCAAAATTTCCATTTGAGGTTCGACAAGAGATAGCGCCTAATTTTGT780                CGACTCTGTTCCTGTCATTGAAGACGATCGAAAATTACACAAGCTTGATGTGAAGACGGG840                TGATTCCATTCACAATGGTCTAACTCCAAGGTGGAATGACTTGGATGTCAATCAGCACGT900                TAACAATGTGAAATACATTGGGTGGATTCTCAAGAGTGTTCCAACAGATGTTTTTGGGGC960                CCAGGAGCTATGTGGA976                                                            (2) INFORMATION FOR SEQ ID NO:10:                                              (i) SEQUENCE CHARACTERISTICS:                                                  (A) LENGTH: 1670 base pairs                                                    (B) TYPE: nucleic acid                                                         (C) STRANDEDNESS: double                                                       (D) TOPOLOGY: linear                                                           (ii) MOLECULE TYPE: cDNA to mRNA                                               (xi) SEQUENCE DESCRIPTION: SEQ ID NO:10:                                       GAATTCGGCACGAGTCTCTCTCTCTCTCTCTCTCTCTCTCTCTCTCTCTCTCTCTCTCTC60                 TCTCCCCAACGAAATTTCAATTCCATTAGCTGTTGACAAAAACAGCTGAAGATCACAAAT120                TTGTTCTCAGAGGAAGAAAAGGAAGGAAGGAAGGAAGGAGGAGGAAGCCATTGTGGGCAA180                TATTTGATCGGTGGATCCTTTCCTCCCGCTCGTTGAAAGACAATGGTGGCTACCGCTGCA240                AGCTCTGCATTCTTCCCCGTGTCGTCCCCGGTCACCTCCTCTAGACCAGGAAAGCCCGGA300                AATGGGTCATCGAGCTTCAGCCCCATCAAGCCCAAATTTGTCGCCAATGGCGGGTTGCAG360                GTTAAGGCAAACGCCAGTGCCCCTCCTAAGATCAATGGTTCCTCGGTCGGTCTAAAGTCC420                
TGCAGTCTCAAGACTCAGGAAGACACTCCTTCGGCCCCTGCTCCACGGACTTTTATCAAC480                CAGTTGCCTGATTGGAGTATGCTTCTTGCTGCAATTACTACTGTCTTCTTGGCAGCAGAG540                AAGCAGTGGATGATGCTTGATTGGAAACCTAAGAGGCCTGACATGCTTGTGGACCCGTTC600                GGATTGGGAAGTATTGTCCAGCATGGGCTTGTGTTCAGGCAGAATTTTTCGATTAGGTCC660                TATGAAATAGGCGCTGATCGCACTGCGTCTATAGAGACGGTGATGAACCACTTGCAGGAA720                ACGGCTCTCAATCATGTTAAGAGTGCGGGGCTTATGAATGACGGCTTTGGTCGTACTCCT780                GAGATGTATAAAAAGGACCTTATTTGGGTTGTCGCGAAAATGCAGGTCATGGTTAACCGC840                TATCCTACTTGGGGTGACACAGTTGAAGTGAATACTTGGGTTGCCAAGTCAGGGAAAAAT900                GGTATGCGTCGTGATTGGCTCATAAGTGATTGTAATACAGGAGAAATTCTTACTAGAGCA960                TCAAGCGTGTGGGTCATGATGAATCAAAAGACAAGAAGATTGTCAAAAATTCCAGATGAG1020               GTTCGGCATGAGATTGAGCCTCATTTTGTGGACTCTCCTCCCGTCATTGAAGACGATGAC1080               CGAAAACTTCCCAAGCTGGATGACAAGACTGCTGACTCCATCCGCAAGGGTCTAACTCCG1140               AAGTGGAATGACTTGGATGTCAATCAGCACGTCAACAACGTGAAGTACATCGGGTGGATT1200               CTTGAGAGTACTCCACAAGAAGTTCTGGAGACCCAGGAGCTATGTTCCCTTACCCTGGAA1260               TACAGGCGGGAATGCGGAAGGGAGAGCGTGCTGGAGTCCCTCACTGCTGCGGACCCCTCT1320               GGAAAGGGCTTTGGGTCCCAGTTCCAGCACCTTCTGAGGCTTGAGGATGGAGGGGAGATT1380               GTGAAGGGGAGAACTGAGTGGCGACCAAAGACTGCAGGTATCAATGGGGCGATACCATCC1440               GGGGAGACCTCACCTGGAGACTCTTAGAAGGGAGCCCTGGTCCCTTTGGAGTTCTGCTTT1500               CTTTATGGTCGGATGAGCTGAGTGAACTGCAGGTAAGGTAGTAGCAATCGGTAGATTGTT1560               TAGTTTGTTTGCTGTTTTTTACTCCGGCTCTCTTTTATAATGTCATGGTCTCATTTGTAT1620               CCTCACATGTTTCGGGTTGATTTATACAATATATTATTTCTATTTGTTTC1670                         (2) INFORMATION FOR SEQ ID NO:11:                                              (i) SEQUENCE CHARACTERISTICS:                                                  (A) LENGTH: 1310 base pairs                                                    (B) TYPE: nucleic acid                                                         (C) STRANDEDNESS: double 
                                                      (D) TOPOLOGY: linear                                                           (ii) MOLECULE TYPE: cDNA to mRNA                                               (xi) SEQUENCE DESCRIPTION: SEQ ID NO:11:                                       GGCACGAGTGCCTCTTCTCCATCTCGTCCTCCCCACATACTGAGCCACCCAGAGAGAGAA60                 CCCAGCCGCTGTTCCCTCGGAAATGTTGAAGCTTTCTTGCAATGCCGCCACC112                        MetLeuLysLeuSerCysAsnAlaAlaThr                                                 1510                                                                           GACCAGATTCTGTCGTCGGCCGTGGCTCAAACCGCATTATGGGGTCAA160                            AspGlnIleLeuSerSerAlaValAlaGlnThrAlaLeuTrpGlyGln                               152025                                                                         CCCAGAAACAGATCCTTTTCAATGTCCGCCCGGAGAAGGGGAGCCGTT208                            ProArgAsnArgSerPheSerMetSerAlaArgArgArgGlyAlaVal                               303540                                                                         TGCTGCGCGCCTCCAGCTGCTGGAAAGCCCCCTGCCATGACCGCTGTT256                            CysCysAlaProProAlaAlaGlyLysProProAlaMetThrAlaVal                               455055                                                                         ATCCCAAAAGACGGGGTGGCCTCGTCCGGGTCCGGCAGCCTGGCCGAC304                            IleProLysAspGlyValAlaSerSerGlySerGlySerLeuAlaAsp                               606570                                                                         CAGCTGAGGCTCGGGAGCCGTACGCAGAATGGGCTGTCGTACACGGAG352                            GlnLeuArgLeuGlySerArgThrGlnAsnGlyLeuSerTyrThrGlu                               75808590                                                                       AAGTTCATTGTCAGGTGCTACGAGGTCGGTATTAACAAGACAGCCACT400                            LysPheIleValArgCysTyrGluValGlyIleAsnLysThrAlaThr                               95100105                                          
                             GTCGAAACCATGGCCAATCTCTTGCAGGAAGTAGGTTGTAACCATGCT448                            ValGluThrMetAlaAsnLeuLeuGlnGluValGlyCysAsnHisAla                               110115120                                                                      CAGAGTGTTGGATTCTCAACTGACGGGTTTGCGACGACGCCTACCATG496                            GlnSerValGlyPheSerThrAspGlyPheAlaThrThrProThrMet                               125130135                                                                      AGGAAATTGAATCTGATATGGGTTACTGCTCGAATGCACATAGAAATT544                            ArgLysLeuAsnLeuIleTrpValThrAlaArgMetHisIleGluIle                               140145150                                                                      TATAAGTACCCAGCATGGAGTGATGTGGTTGAAATCGAGACTTGGTGC592                            TyrLysTyrProAlaTrpSerAspValValGluIleGluThrTrpCys                               155160165170                                                                   CAAAGTGAAGGAAGAATCGGAACAAGAAGGGATTGGATTCTCAAGGAT640                            GlnSerGluGlyArgIleGlyThrArgArgAspTrpIleLeuLysAsp                               175180185                                                                      TATGGTAATGGTGAAGTTATTGGAAGAGCCACAAGCAAGTGGGTGATG688                            TyrGlyAsnGlyGluValIleGlyArgAlaThrSerLysTrpValMet                               190195200                                                                      ATGAACCAGAACACTAGACGACTCCAAAAAGTTGATGATTCCGTTCGA736                            MetAsnGlnAsnThrArgArgLeuGlnLysValAspAspSerValArg                               205210215                                                                      GAAGAGTATATGGTTTTCTGTCCACGCGAACCAAGGTTATCATTTCCT784                            GluGluTyrMetValPheCysProArgGluProArgLeuSerPhePro                               220225230                                                                      GAAGAGAACAATCGGAGTTTGAGAAAAATATCTAAATTGGAAGATCCT832                        
    GluGluAsnAsnArgSerLeuArgLysIleSerLysLeuGluAspPro                               235240245250                                                                   GCTGAGTATTCGAGACTTGGTCTTACGCCTAGAAGAGCTGATCTGGAT880                            AlaGluTyrSerArgLeuGlyLeuThrProArgArgAlaAspLeuAsp                               255260265                                                                      ATGAACCAGCATGTCAACAACGTTGCTTACATAGGTTGGGCTCTGGAG928                            MetAsnGlnHisValAsnAsnValAlaTyrIleGlyTrpAlaLeuGlu                               270275280                                                                      AGTGTACCTCAAGAAATAATCGACTCTTATGAGCTGGAAACTATCACT976                            SerValProGlnGluIleIleAspSerTyrGluLeuGluThrIleThr                               285290295                                                                      CTGGACTACAGAAGAGAATGCCAACAGGATGACGTAGTCGATTCGCTC1024                           LeuAspTyrArgArgGluCysGlnGlnAspAspValValAspSerLeu                               300305310                                                                      ACCAGTGTTCTGTCAGATGAGGAATCAGGAACATTACCAGAGCTCAAG1072                           ThrSerValLeuSerAspGluGluSerGlyThrLeuProGluLeuLys                               315320325330                                                                   GGAACAAATGGATCTGCATCCACCCCACTGAAACGTGACCATGATGGC1120                           GlyThrAsnGlySerAlaSerThrProLeuLysArgAspHisAspGly                               335340345                                                                      TCTCGCCAGTTCTTGCACTTGCTGAGGCTCTCCCCCGACGGGCTAGAA1168                           SerArgGlnPheLeuHisLeuLeuArgLeuSerProAspGlyLeuGlu                               350355360                                                                      ATAAACCGTGGCCGAACTGAATGGAGAAAGAAATCCACGAAA1210                                 IleAsnArgGlyArgThrGluTrpArgLysLysSerThrLys                                     365370375            
                                                          TAGAGGAGTCTCTTACATCCTGCCCATCTGGTTTGATCTGCATATGGTATTTTCCCTTGC1270               ACGCTTTTGCTTCCTGTTTATTTGAGTTTGATTCAGCACC1310                                   (2) INFORMATION FOR SEQ ID NO:12:                                              (i) SEQUENCE CHARACTERISTICS:                                                  (A) LENGTH: 1307 base pairs                                                    (B) TYPE: nucleic acid                                                         (C) STRANDEDNESS: double                                                       (D) TOPOLOGY: linear                                                           (ii) MOLECULE TYPE: cDNA to mRNA                                               (xi) SEQUENCE DESCRIPTION: SEQ ID NO:12:                                       GCTCGCCTCCCACATTTTCTTCTTCGATCCCGAAAAGATGTTGAAGCTCTCGTGT55                      MetLeuLysLeuSerCys                                                             15                                                                             AATGCGACTGATAAGTTACAGACCCTCTTCTCGCATTCTCATCAACCG103                            AsnAlaThrAspLysLeuGlnThrLeuPheSerHisSerHisGlnPro                               101520                                                                         GATCCGGCACACCGGAGAACCGTCTCCTCCGTGTCGTGCTCTCATCTG151                            AspProAlaHisArgArgThrValSerSerValSerCysSerHisLeu                               253035                                                                         AGGAAACCGGTTCTCGATCCTTTGCGAGCGATCGTATCTGCTGATCAA199                            ArgLysProValLeuAspProLeuArgAlaIleValSerAlaAspGln                               404550                                                                         GGAAGTGTGATTCGAGCAGAACAAGGTTTGGGCTCACTCGCGGATCAG247                            GlySerValIleArgAlaGluGlnGlyLeuGlySerLeuAlaAspGln                               55606570                                      
                                 CTCCGATTGGGTAGCTTGACGGAGGATGGTTTGTCGTATAAGGAGAAG295                            LeuArgLeuGlySerLeuThrGluAspGlyLeuSerTyrLysGluLys                               758085                                                                         TTCATCGTCAGATCCTACGAAGTGGGGAGTAACAAGACCGCCACTGTC343                            PheIleValArgSerTyrGluValGlySerAsnLysThrAlaThrVal                               9095100                                                                        GAAACCGTCGCTAATCTTTTGCAGGAGGTGGGATGTAATCATGCGCAG391                            GluThrValAlaAsnLeuLeuGlnGluValGlyCysAsnHisAlaGln                               105110115                                                                      AGCGTTGGATTCTCGACTGATGGGTTTGCGACAACACCGACCATGAGG439                            SerValGlyPheSerThrAspGlyPheAlaThrThrProThrMetArg                               120125130                                                                      AAACTGCATCTCATTTGGGTCACTGCGAGAATGCATATAGAGATCTAC487                            LysLeuHisLeuIleTrpValThrAlaArgMetHisIleGluIleTyr                               135140145150                                                                   AAGTACCCTGCTTGGGGTGATGTGGTTGAGATAGAGACATGGTGTCAG535                            LysTyrProAlaTrpGlyAspValValGluIleGluThrTrpCysGln                               155160165                                                                      AGTGAAGGAAGGATCGGGACTAGGCGTGATTGGATTCTTAAGGATGTT583                            SerGluGlyArgIleGlyThrArgArgAspTrpIleLeuLysAspVal                               170175180                                                                      GCTACGGGTGAAGTCACTGGCCGTGCTACAAGCAAGTGGGTGATGATG631                            AlaThrGlyGluValThrGlyArgAlaThrSerLysTrpValMetMet                               185190195                                                                      AACCAAGACACAAGACGGCTTCAGAAAGTTTCTGATGATGTTCGGGAC679                    
        AsnGlnAspThrArgArgLeuGlnLysValSerAspAspValArgAsp                               200205210                                                                      GAGTACTTGGTCTTCTGTCCTAAAGAACTCAGATTAGCATTTCCTGAG727                            GluTyrLeuValPheCysProLysGluLeuArgLeuAlaPheProGlu                               215220225230                                                                   GAGAATAACAGAAGCTTGAAGAAAATTCCGAAACTCGAAGATCCAGCT775                            GluAsnAsnArgSerLeuLysLysIleProLysLeuGluAspProAla                               235240245                                                                      CAGTATTCGATGATTGGGCTTAAGCCTAGACGAGCTGATCTCGACATG823                            GlnTyrSerMetIleGlyLeuLysProArgArgAlaAspLeuAspMet                               250255260                                                                      AACCAGCATGTCAATAATGTCACCTATATTGGATGGGTTCTTGAGAGC871                            AsnGlnHisValAsnAsnValThrTyrIleGlyTrpValLeuGluSer                               265270275                                                                      ATACCTCAAGAGATTGTAGACACGCACGAACTTCAGGTCATAACTCTG919                            IleProGlnGluIleValAspThrHisGluLeuGlnValIleThrLeu                               280285290                                                                      GATTACAGAAGAGAATGTCAACAAGACGATGTGGTGGATTCACTCACC967                            AspTyrArgArgGluCysGlnGlnAspAspValValAspSerLeuThr                               295300305310                                                                   ACTACCACCTCAGAGATTGGTGGGACCAATGGCTCTGCATCATCAGGC1015                           ThrThrThrSerGluIleGlyGlyThrAsnGlySerAlaSerSerGly                               315320325                                                                      ACACAGGGGCAAAACGATAGCCAGTTCTTACATCTCTTAAGGCTGTCT1063                           ThrGlnGlyGlnAsnAspSerGlnPheLeuHisLeuLeuArgLeuSer                               330335340        
GGAGACGGTCAGGAGATCAACCGCGGGACAACCCTGTGGAGAAAGAAG 1111
GlyAspGlyGlnGluIleAsnArgGlyThrThrLeuTrpArgLysLys
345 350 355
CCCTCCAATCTCTAAGCCATTTCGTTCTTAAGTTTCCTCTATCTGTGTCGCT 1163
ProSerAsnLeu
360
CGATGCTTCACGAGTCTAGTCAGGTCTCATTTTTTTCAATCTAAATTTGGGTTAGACTAG 1223
AGAACTGGAATTATTGGAATTTATGAGTTTTCGTTCTTGTTTCTGTACAAATCTTGAGGA 1283
TTGAAGCCAAACCCATTTCATCTT 1307
(2) INFORMATION FOR SEQ ID NO:13:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 100 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: other nucleic acid
(A) DESCRIPTION: synthetic oligonucleotides
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:13:
CGGTCTAGATAACAATCAATGCAAGACTATTGCACACGTGTTGCGTGTGAACAATGGTCA 60
GGAGCTTCACGTCTGGGAAACGCCCCCAAAAGAAAACGTG 100
(2) INFORMATION FOR SEQ ID NO:14:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 100 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: synthetic oligonucleotides
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:14:
ATACTCGGCCAATCCAGCGAAGTGGTCCATTCTTCTGGCGAAACCAGAAGCAATCAAAAT 60
GGTGTTGTTTTTAAAAGGCACGTTTTCTTTTGGGGGCGTT 100

What is claimed is:
 1. A recombinant DNA construct comprising a plant medium-chain preferring acyl-ACP thioesterase encoding sequence, wherein said thioesterase has hydrolysis activity towards C8 and C10 fatty acyl-ACP substrates.
 2. The construct of claim 1 encoding a precursor plant medium-chain preferring acyl-ACP thioesterase.
 3. The construct of claim 1 wherein said plant is Cuphea hookeriana.
 4. A recombinant DNA construct comprising an expression cassette capable of producing a plant medium-chain preferring acyl-ACP thioesterase in a host cell, wherein said construct comprises, in the 5' to 3' direction of transcription, a transcriptional initiation regulatory region functional in said host cell, a translational initiation regulatory region functional in said host cell, a DNA sequence encoding a biologically active plant medium-chain preferring acyl-ACP thioesterase having activity towards C8 and C10 fatty acyl-ACP substrates, and a transcriptional and translational termination regulatory region functional in said host cell, wherein said plant thioesterase encoding sequence is under the control of said regulatory regions.
 5. The construct of claim 4 wherein said host cell is a plant cell.
 6. The construct of claim 5 wherein said transcriptional initiation region is obtained from a gene preferentially expressed in plant seed tissue.
 7. The construct of claim 4 wherein said sequence is obtainable from Cuphea hookeriana.
 8. The construct of claim 4 wherein said sequence is from a Cuphea hookeriana CUPH-2 thioesterase gene.
 9. A host cell comprising a plant thioesterase encoding sequence construct of claim 4.
 10. The cell of claim 9 wherein said cell is a plant cell.
 11. The cell of claim 10 wherein said plant cell is a Brassica plant cell.
 12. A transgenic host cell comprising an expressed plant medium-chain preferring acyl-ACP thioesterase having activity towards C8 and C10 fatty acyl-ACP substrates.
 13. The cell of claim 12 wherein said host cell is a plant cell.
 14. A method of producing C8 and C10 fatty acids in a plant host cell, wherein said method comprises: growing a plant cell having integrated into its genome a DNA construct, said construct comprising in the 5' to 3' direction of transcription, a transcriptional regulatory region functional in said plant cell and a plant medium-chain preferring acyl-ACP thioesterase encoding sequence, under conditions which will permit the expression of said plant thioesterase, wherein said plant thioesterase has activity towards C8 and C10 fatty acyl-ACP substrates.
 15. The method of claim 14 wherein said plant cell is an oilseed embryo plant cell.
 16. The method of claim 14 wherein said plant thioesterase encoding sequence is obtainable from Cuphea hookeriana.
 17. The method of claim 14 wherein said plant thioesterase encoding sequence is from a Cuphea hookeriana CUPH-2 thioesterase gene.
 18. A plant cell having a modified fatty acid composition produced according to the method of claim 14.
 19. The construct of claim 8 wherein said sequence encodes the Cuphea hookeriana acyl-ACP thioesterase protein shown in FIG. 7 (SEQ ID NO:8).
 20. The construct of claim 8 wherein said sequence comprises the Cuphea hookeriana acyl-ACP thioesterase encoding sequence shown in FIG. 7 (SEQ ID NO:8). 